comparison docs/WAC2016/WAC2016.tex @ 340:90b9b077475a WAC2016
Added screen shots, edited bib. Some edits to intro, abstract and conclusion.
author | Nicholas Jillings <nicholas.jillings@eecs.qmul.ac.uk> |
---|---|
date | Thu, 15 Oct 2015 21:17:14 +0100 |
parents | b52911d1df64 |
children | cccb3387d375 |
339:b52911d1df64 | 340:90b9b077475a |
---|---|
124 % is the number that will appear on the first page PLUS the | 124 % is the number that will appear on the first page PLUS the |
125 % number that will appear in the \additionalauthors section. | 125 % number that will appear in the \additionalauthors section. |
126 | 126 |
127 \maketitle | 127 \maketitle |
128 \begin{abstract} | 128 \begin{abstract} |
129 Here comes the abstract. | 129 |
130 Perceptual listening tests are commonplace in audio research and a vital form of evaluation. Many tools exist to run such tests; however, many support only one test type and are therefore limited, whilst most require proprietary software. Using Web Audio, the Web Audio Evaluation Tool (WAET) addresses these concerns with a single toolbox which can be configured to run many different tests through a web browser, without the need for proprietary software or computer programming knowledge. In this paper the role of the Web Audio API in giving WAET its key functionality is shown. The paper also highlights less common features available to web-based tools, such as an easy remote testing environment and in-browser analytics. | |
131 | |
130 \end{abstract} | 132 \end{abstract} |
131 | 133 |
132 | 134 |
133 \section{Introduction} | 135 \section{Introduction} |
134 | 136 |
184 \end{table*} | 186 \end{table*} |
185 % | 187 % |
186 %Selling points: remote tests, visualisaton, create your own test in the browser, many interfaces, few/no dependencies, flexibility | 188 %Selling points: remote tests, visualisaton, create your own test in the browser, many interfaces, few/no dependencies, flexibility |
187 | 189 |
188 %[Talking about what we do in the various sections of this paper. Referring to \cite{waet}. ] | 190 %[Talking about what we do in the various sections of this paper. Referring to \cite{waet}. ] |
189 To meet the need for a cross-platform, versatile and easy-to-use listening test tool, we previously developed the Web Audio Evaluation Tool \cite{waet}, which at the time of its inception could run a listening test in the browser from an XML configuration file, store the results as XML, and offered one particular interface. We have now expanded this into a tool with which a wide range of listening test types can easily be constructed and set up remotely, without any need to manually alter code or configuration files, and which allows visualisation of the collected results in the browser. In this paper, we discuss these different aspects and explore possible future improvements. Specifically, in Section \ref{sec:architecture} we cover the general implementation aspects, with a focus on the Web Audio API, followed by a discussion of the requirements for successful remote tests in Section \ref{sec:remote}. Section \ref{sec:interfaces} describes the various interfaces the tool supports, as well as how to keep this manageable. Finally, in Section \ref{sec:analysis} we provide an overview of the analysis capabilities in the browser, before summarising our findings and listing future research directions in Section \ref{sec:conclusion}. | 191 To meet the need for a cross-platform, versatile and easy-to-use listening test tool, we previously developed the Web Audio Evaluation Tool \cite{waet}, which at the time of its inception could run a listening test in the browser from an XML configuration file, store the results as XML, and offered one particular interface. We have now expanded this into a tool with which a wide range of listening test types can easily be constructed and set up remotely, without any need to manually alter code or configuration files, and which allows visualisation of the collected results in the browser. In this paper, we discuss these different aspects and explore possible future improvements. |
190 | 192 |
191 \begin{figure}[tb] | 193 \begin{figure}[tb] |
192 \centering | 194 \centering |
193 \includegraphics[width=.5\textwidth]{interface.png} | 195 \includegraphics[width=.5\textwidth]{interface.png} |
194 \caption{A simple example of a multi-stimulus, single-attribute, single-rating-scale test with a reference and comment fields.} | 196 \caption{A simple example of a multi-stimulus, single-attribute, single-rating-scale test with a reference and comment fields.} |
229 | 231 |
230 Although WAET uses only a sparse subset of the Web Audio API's functionality, its performance comes directly from using it. Listening tests can yield large amounts of information beyond the perceptual relationships between the audio fragments. With WAET it is possible to record which parts of the audio fragments were listened to and when, at what point in the audio stream the participant switched to a different fragment, and how a fragment's rating was adjusted over time within a session, to name a few. Not only does this allow evaluation of a wealth of perceptual aspects, it also helps detect unreliable participants whose results are potentially not representative. | 232 Although WAET uses only a sparse subset of the Web Audio API's functionality, its performance comes directly from using it. Listening tests can yield large amounts of information beyond the perceptual relationships between the audio fragments. With WAET it is possible to record which parts of the audio fragments were listened to and when, at what point in the audio stream the participant switched to a different fragment, and how a fragment's rating was adjusted over time within a session, to name a few. Not only does this allow evaluation of a wealth of perceptual aspects, it also helps detect unreliable participants whose results are potentially not representative. |
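A minimal sketch of this kind of session logging, timestamping fragment events against the shared Web Audio clock; the names (listeningLog, logEvent) are illustrative, not WAET's actual code:

```javascript
// A minimal sketch of session-event logging, assuming one shared AudioContext;
// listeningLog and logEvent are illustrative names, not WAET's actual code.
var audioContext = new AudioContext();
var listeningLog = [];

function logEvent(fragmentId, eventType) {
    listeningLog.push({
        id: fragmentId,                 // which fragment the event concerns
        event: eventType,               // e.g. "play", "switch", "rating-move"
        time: audioContext.currentTime  // session time on the audio clock
    });
}

// Example: record that the participant switched to fragment 3.
logEvent(3, "switch");
```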
231 | 233 |
232 One of the key initial design parameters for WAET was to make the tool as open as possible to non-programmers, and to this end all of the user-modifiable options are included in a single XML document. This document is called the specification document and can be designed either by manually writing the XML (or modifying an existing document or template) or by using our included test creators. These are standalone HTML pages which require no server or internet connection and help build the test specification document. The first (test\_create.html) is for simpler tests and guides the user step by step. It supports adding media through drag and drop and has a clutter-free interface. The advanced version is for more complex tests where the same freedom is required without manipulating raw XML directly (whilst keeping a safety net). Both tools support automatic verification to ensure the XML file is valid, highlighting areas which are incorrect and would cause an error, or options which should be removed as they are blank. | 234 One of the key initial design parameters for WAET was to make the tool as open as possible to non-programmers, and to this end all of the user-modifiable options are included in a single XML document. This document is called the specification document and can be designed either by manually writing the XML (or modifying an existing document or template) or by using our included test creators. These are standalone HTML pages which require no server or internet connection and help build the test specification document. The first (test\_create.html) is for simpler tests and guides the user step by step. It supports adding media through drag and drop and has a clutter-free interface. The advanced version is for more complex tests where the same freedom is required without manipulating raw XML directly (whilst keeping a safety net). Both tools support automatic verification to ensure the XML file is valid, highlighting areas which are incorrect and would cause an error, or options which should be removed as they are blank. |
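As an illustration of such verification, the sketch below checks that a specification document is well-formed XML using the browser's standard DOMParser; the function name validateSpecification is hypothetical, and the actual creators additionally check option values as described above:

```javascript
// Illustrative well-formedness check of the kind the creators perform;
// validateSpecification is a hypothetical name. DOMParser is a standard browser API.
function validateSpecification(xmlString) {
    var doc = new DOMParser().parseFromString(xmlString, "application/xml");
    // Browsers report XML parse failures through a <parsererror> element.
    var error = doc.getElementsByTagName("parsererror")[0];
    if (error) {
        return { valid: false, message: error.textContent };
    }
    return { valid: true, document: doc };
}
```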
233 | 235 |
234 The basic test creator utilises the Web Audio API to perform quick playback checks and also allows for loudness normalisation techniques inspired by \cite{ape}. The gains are calculated offline by accessing the raw audio samples exposed by the buffer, and are then applied to the audio element as a gain attribute. This performs loudness normalisation in the test without needing to edit any audio files. Equally, the gain can be modified in either editor using an HTML5 slider or number box. | 236 The basic test creator, shown in Figure \ref{fig:test_create}, utilises the Web Audio API to perform quick playback checks and also allows for loudness normalisation techniques inspired by \cite{ape}. The gains are calculated offline by accessing the raw audio samples exposed by the buffer, and are then applied to the audio element as a gain attribute. This performs loudness normalisation in the test without needing to edit any audio files. Equally, the gain can be modified in either editor using an HTML5 slider or number box. |
237 | |
238 \begin{figure}[h!] | |
239 \centering | |
240 \includegraphics[width=.45\textwidth]{test_create_2.png} | |
241 \caption{Screenshot of the test creator tool, using drag and drop to build the specification document.} | |
242 \label{fig:test_create} | |
243 \end{figure} | |
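The loudness normalisation described above can be sketched as follows, assuming an equal-RMS target; the exact procedure in WAET is inspired by the APE tool cited above and may differ in detail:

```javascript
// A minimal sketch of offline loudness normalisation using equal RMS as the
// target; WAET's actual procedure may differ in detail.
function rmsOf(buffer) {
    var sum = 0, count = 0;
    for (var c = 0; c < buffer.numberOfChannels; c++) {
        var data = buffer.getChannelData(c); // raw Float32 samples from the AudioBuffer
        for (var i = 0; i < data.length; i++) {
            sum += data[i] * data[i];
        }
        count += data.length;
    }
    return Math.sqrt(sum / count);
}

// The resulting gain is stored in the specification document and applied at
// playback time, so no audio file is ever edited.
function normalisationGain(buffer, targetRms) {
    return targetRms / rmsOf(buffer);
}
```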
235 | 244 |
236 %Describe and/or visualise audioholder-audioelement-... structure. | 245 %Describe and/or visualise audioholder-audioelement-... structure. |
237 The specification document contains the URLs of the audio fragments for each test page. These fragments are downloaded asynchronously during the test and decoded offline by the Web Audio API's decoder. The resulting buffers are assigned to custom Audio Object nodes, each of which tracks the fragment buffer, the playback bufferSourceNode, the XML information including its unique test ID, the interface object(s) associated with the fragment, and any metric or data collection objects. The Audio Objects are controlled by an overarching custom Audio Context node (not to be confused with the Web Audio context). This parent JS node allows for session-wide control of the Audio Objects, including starting and stopping playback of specific nodes. | 246 The specification document contains the URLs of the audio fragments for each test page. These fragments are downloaded asynchronously during the test and decoded offline by the Web Audio API's decoder. The resulting buffers are assigned to custom Audio Object nodes, each of which tracks the fragment buffer, the playback bufferSourceNode, the XML information including its unique test ID, the interface object(s) associated with the fragment, and any metric or data collection objects. The Audio Objects are controlled by an overarching custom Audio Context node (not to be confused with the Web Audio context). This parent JS node allows for session-wide control of the Audio Objects, including starting and stopping playback of specific nodes. |
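A sketch of this download-and-decode path follows; the Audio Object layout shown (id, buffer, sourceNode, metrics) is illustrative, not WAET's exact structure:

```javascript
// Sketch of the asynchronous download-and-decode path; the object layout
// (id, buffer, sourceNode, metrics) is illustrative, not WAET's exact fields.
var audioContext = new AudioContext();

function loadFragment(url, testId, onReady) {
    var request = new XMLHttpRequest();
    request.open("GET", url, true);
    request.responseType = "arraybuffer";
    request.onload = function () {
        // decodeAudioData decodes the compressed data asynchronously.
        audioContext.decodeAudioData(request.response, function (buffer) {
            onReady({
                id: testId,       // unique test ID from the specification document
                buffer: buffer,   // the decoded AudioBuffer, kept for reuse
                sourceNode: null, // a new bufferSourceNode is made per playback
                metrics: []       // per-fragment metric and data collection
            });
        });
    };
    request.send();
}
```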
238 | 247 |
239 The only issue with this model is the bufferSourceNode in the Web Audio API, which the standard defines as a `use once' object: once it has been played it must be discarded, as it cannot be instructed to play its buffer again. Therefore, on each start request a new bufferSourceNode must be created and linked with the stored buffer. This is odd behaviour for such a simple object, and the only alternative is the HTML5 audio element, which cannot be started synchronously at a given time and is therefore not suited. | 248 The only issue with this model is the bufferSourceNode in the Web Audio API, which the standard defines as a `use once' object: once it has been played it must be discarded, as it cannot be instructed to play its buffer again. Therefore, on each start request a new bufferSourceNode must be created and linked with the stored buffer. This is odd behaviour for such a simple object, and the only alternative is the HTML5 audio element, which cannot be started synchronously at a given time and is therefore not suited. |
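A minimal sketch of the resulting playback pattern, with illustrative names: a fresh bufferSourceNode is created for every start request, while the decoded buffer is reused:

```javascript
// A minimal sketch of the `use once' workaround: build a fresh
// AudioBufferSourceNode for every start request (names illustrative).
var audioContext = new AudioContext();

function startPlayback(audioObject, when) {
    var source = audioContext.createBufferSource();
    source.buffer = audioObject.buffer;  // the stored AudioBuffer is reusable
    source.connect(audioContext.destination);
    source.start(when);                  // synchronised start on the audio clock
    audioObject.sourceNode = source;     // keep a handle so it can be stopped
}

function stopPlayback(audioObject) {
    if (audioObject.sourceNode) {
        audioObject.sourceNode.stop();   // after this the node must be discarded
        audioObject.sourceNode = null;
    }
}
```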
388 %%%% \item 2 point Scale - Better or Worse - (not sure how to default this - they default everything to better, which is an interesting choice) | 397 %%%% \item 2 point Scale - Better or Worse - (not sure how to default this - they default everything to better, which is an interesting choice) |
389 %%%% \end{itemize} | 398 %%%% \end{itemize} |
390 %%%% \end{itemize} | 399 %%%% \end{itemize} |
391 | 400 |
392 % Build your own test | 401 % Build your own test |
402 | |
393 \begin{comment} | 403 \begin{comment} |
394 { \bf A screenshot would be nice. | 404 { \bf A screenshot would be nice. |
395 | 405 |
396 Established tests (see below) included as `presets' in the build-your-own-test page. } | 406 Established tests (see below) included as `presets' in the build-your-own-test page. } |
397 \end{comment} | 407 \end{comment} |
426 \section{Concluding remarks and future work} | 436 \section{Concluding remarks and future work} |
427 \label{sec:conclusion} | 437 \label{sec:conclusion} |
428 | 438 |
429 The code and documentation can be pulled or downloaded from our online repository available at \url{code.soundsoftware.ac.uk/projects/webaudioevaluationtool}. | 439 The code and documentation can be pulled or downloaded from our online repository available at \url{code.soundsoftware.ac.uk/projects/webaudioevaluationtool}. |
430 | 440 |
431 [Talking a little bit about what else might happen. Unless we really want to wrap this up. ] | |
432 | |
433 \cite{schoeffler2015mushra} gives a `checklist' for subjective evaluation of audio systems. The Web Audio Evaluation Tool meets most of the given requirements, including remote testing, crossfading between audio streams, collecting browser information, offering a range of UI elements, and working with various audio formats including uncompressed PCM (WAV). | 441 \cite{schoeffler2015mushra} gives a `checklist' for subjective evaluation of audio systems. The Web Audio Evaluation Tool meets most of the given requirements, including remote testing, crossfading between audio streams, collecting browser information, offering a range of UI elements, and working with various audio formats including uncompressed PCM (WAV). |
434 % remote | 442 % remote |
435 % language support (not explicitly stated) | 443 % language support (not explicitly stated) |
436 % crossfades | 444 % crossfades |
437 % choosing speakers/sound device from within browser? --- NOT POSSIBLE, can only determine channel output counts and it's up to the hardware to determine | 445 % choosing speakers/sound device from within browser? --- NOT POSSIBLE, can only determine channel output counts and it's up to the hardware to determine |
438 % collect information about software and sound system | 446 % collect information about software and sound system |
439 % buttons, scales, ... UI elements | 447 % buttons, scales, ... UI elements |
440 % must be able to load uncompressed PCM | 448 % must be able to load uncompressed PCM |
441 | 449 |
450 The use of the Web Audio API is therefore key to WAET meeting these and other requirements for performing perceptual evaluation tests. Combined with the power of the HTML DOM environment, which gives the ability to interact with all on-page elements, this creates a powerful and flexible tool capable of performing a multitude of tests out of the box, whilst other tests can easily be built on top of the framework provided. | |
451 \begin{comment} | |
442 [What can we not do? `Method of adjustment', as in \cite{schoeffler2015mushra} is another can of worms, because, like, you could adjust lots of things (volume is just one of them, that could be done quite easily). Same for using input signals like the participant's voice. Either leave out, or mention this requires modification of the code we provide.] | 452 [What can we not do? `Method of adjustment', as in \cite{schoeffler2015mushra} is another can of worms, because, like, you could adjust lots of things (volume is just one of them, that could be done quite easily). Same for using input signals like the participant's voice. Either leave out, or mention this requires modification of the code we provide.] |
453 \end{comment} | |
443 | 454 |
444 % | 455 % |
445 % The following two commands are all you need in the | 456 % The following two commands are all you need in the |
446 % initial runs of your .tex file to | 457 % initial runs of your .tex file to |
447 % produce the bibliography for the citations in your paper. | 458 % produce the bibliography for the citations in your paper. |
448 \bibliographystyle{abbrv} | 459 \bibliographystyle{ieeetr} |
449 \bibliography{WAC2016} % WAC2016.bib is the name of the Bibliography in this case | 460 \bibliography{WAC2016} % WAC2016.bib is the name of the Bibliography in this case |
450 % You must have a proper ".bib" file | 461 % You must have a proper ".bib" file |
451 % and remember to run: | 462 % and remember to run: |
452 % latex bibtex latex latex | 463 % latex bibtex latex latex |
453 % to resolve all references | 464 % to resolve all references |