webaudioevaluationtool: docs/SMC15/smc2015template.tex comparison

comparison docs/SMC15/smc2015template.tex @ 1732:7435309fd918

Paper: Setup, output, conclusion, bibliography

author	Brecht De Man <b.deman@qmul.ac.uk>
date	Mon, 27 Apr 2015 22:00:22 +0100
parents	a596b1cbed6f
children	e9e182543f99


-:a596b1cbed6f
+:7435309fd918
 \item \texttt{index.html}: The main index file to load the scripts, this is the file the browser must request to load.
 \item \texttt{core.js}: Contains global functions and object prototypes to define the audio playback engine, audio objects and loading media files
 \item \texttt{ape.js}: Parses setup files to create the interface as instructed, following the same style chain as the MATLAB APE Tool \cite{deman2014b}.
 \end{itemize}
-The HTML file loads the \texttt{core.js} file along with a few other ancillary files (such as the jQuery JavaScript extensions\footnote{http://jquery.com/}), at which point the browser JavaScript begins to execute the on-page instructions, which gives the URL of the test set up XML document (outlined in Section \ref{sec:setupresultsformats}). \texttt{core.js} parses this document and executes the functions in \texttt{ape.js} to build the web page. The reason for separating these two files is to allow for further interface designs (such as MUSHRA \cite{mushra} or AB tests \cite{bech}) to be used, which would still require the same underlying core functions outlined in \texttt{core.js}.
+The HTML file loads the \texttt{core.js} file along with a few other ancillary files (such as the jQuery JavaScript extensions\footnote{http://jquery.com/}), at which point the browser JavaScript begins to execute the on-page instructions, which gives the URL of the test setup XML document (outlined in Section \ref{sec:setupresultsformats}). \texttt{core.js} parses this document and executes the functions in \texttt{ape.js} to build the web page. The reason for separating these two files is to allow for further interface designs (such as MUSHRA \cite{mushra} or AB tests \cite{bech}) to be used, which would still require the same underlying core functions outlined in \texttt{core.js}.
 The \texttt{ape.js} file has several main functions but the most important are documented here. \textit{loadInterface(xmlDoc)} is called to decode the supplied project document with respect to the interface specified and define any global structures (such as the slider interface). It also identifies the number of pages in the test and randomises the order, if specified to do so. This is the only mandatory function in any of the interface files as this is called by \texttt{core.js} when the document is ready. \texttt{core.js} cannot `see' any interface-specific functions and therefore cannot assume any are available. Therefore \textit{loadInterface(xmlDoc)} is essential to set up the entire test environment. Because the interface files are loaded by \texttt{core.js} and because the functions in \texttt{core.js} are global, the interface files can `see' the \texttt{core.js} file and can therefore not only interact with it, but also modify it.
 Each test page is loaded using \textit{loadTest(id)} which performs two major tasks: to populate the interface with the slider elements and comment boxes; and secondly to instruct the \textit{audioEngine} to load the audio fragments and construct the backend audio graph. \textit{loadTest(id)} also instructs the audio engine in \texttt{core.js} to create the \textit{audioObject}.
 These are custom audio nodes, one representing each audio element specified in each page.
 Browsers support various audio file formats, and support for any given format is not consistent across browsers. Currently the Web Audio API is best supported in Chrome, Firefox, Opera and Safari. All of these support the use of the uncompressed WAV format. Although WAV is not a compact, web-friendly format, most connections have sufficient bandwidth that this should not be a problem. Ogg Vorbis is another well supported format across the four supported major desktop browsers, as well as MP3 (although Firefox may not support all MP3 types)\footnote{https://developer.mozilla.org/en-US/docs/Web/HTML/\\Supported\_media\_formats}. %https://developer.mozilla.org/en-US/docs/Web/HTML/Supported_media_formats
 One issue with the Web Audio API is that the sample rate is assigned by the system sound device, and the application has no way to request a different one. % Does this make sense? The problem is across all audio files.
-As the sampling rate and the effect of resampling may be critical for some listening tests, the default operation when an audio file is loaded with a different sample rate to that of the system is to convert the sample rate. To provide a check for this, the desired sample rate can be supplied with the set up XML and checked against. If the sample rates do not match, a browser alert window is shown asking for the sample rate to be correctly adjusted.
+As the sampling rate and the effect of resampling may be critical for some listening tests, the default operation when an audio file is loaded with a different sample rate to that of the system is to convert the sample rate. To provide a check for this, the desired sample rate can be supplied with the setup XML and checked against. If the sample rates do not match, a browser alert window is shown asking for the sample rate to be correctly adjusted.
 This happens before any loading or decoding of audio files so the browser will only be instructed to fetch files if the system sample rate meets the requirements, avoiding multiple requests for large files until they are actually needed.
 %During playback, the playback nodes loop indefinitely until playback is stopped. The gain nodes in the \textit{audioObject}s enable dynamic muting of nodes. When a bar in the sliding ranking is clicked, the audio engine mutes all \textit{audioObject}s and un-mutes the clicked one. Therefore, if the audio samples are perfectly aligned up and of the same sample length, they will remain perfectly aligned with each other.
 % Don't think this is relevant anymore
 \section{Input and result files}\label{sec:setupresultsformats}
-The set up and result files both use the common XML document format to outline the various parameters. The set up file determines which interface to use, the location of audio files, how many pages and other parameters to define the testing environment. Having one document to modify allows for quick manipulation in a `human readable' form to create new tests, or adjust current ones, without needing to edit multiple web files. An example of this XML document is presented in Figure~\ref{fig:xmlIn}% I mean the .js and .html files, though not sure if any better.
+The setup and result files both use the common XML document format to outline the various parameters. The setup file determines the interface to use, the location of audio files, the number of pages and other parameters to define the testing environment. Having one document to modify allows for quick manipulation in a `human readable' form to create new tests, or adjust current ones, without needing to edit multiple web files. Furthermore, we also provide a simple web page to enter all these settings without needing to manipulate the raw XML. An example of this XML document is presented in Figure~\ref{fig:xmlIn}. % I mean the .js and .html files, though not sure if any better.
-\subsection{Set up and configurability}
+\subsection{Setup and configurability}
 \begin{figure}[ht]
 \begin{center}
 \includegraphics[width=0.5\textwidth]{XMLInput2.png}
-\caption{An Example Input XML File}
+\caption{An example input XML file}
 \label{fig:xmlIn}
 \end{center}
 \end{figure}
-The set up document has several defined nodes and structure which are documented with the source code. For example there is a section for general set up options where any pre-test and post-test questions and statements can be defined. Pre- and post-test dialogue boxes allow for comments or questions to be presented before or after the test, to convey listening test instructions, gather information about the subject, listening environment, and overall experience of the test. From the example in Figure~\ref{fig:xmlIn}, it can be seen that a question box should be generated, with the id 'location' and it is mandatory to answer. The question is in the PreTest node meaning it will appear before any testing will begin. When the result for the  entire test is shown, the response will appear in the PreTest node with the id 'location' allowing it to be found easily. This outlines the importance of having clear and meaningful ID values.
+The setup document has several defined nodes and structure which are documented with the source code. For example, there is a section for general setup options where any pre-test and post-test questions and statements can be defined. Pre- and post-test dialogue boxes allow for comments or questions to be presented before or after the test, to convey listening test instructions, and gather information about the subject, listening environment, and overall experience of the test. In the example in Figure~\ref{fig:xmlIn}, a question box with the id `location' is added, which is set to be mandatory to answer. The question is in the PreTest node meaning it will appear before any testing will begin. When the result for the  entire test is shown, the response will appear in the PreTest node with the id `location' allowing it to be found easily, provided the id values are meaningful.
-We try to cater to a diverse audience with this toolbox, while ensuring the toolbox is simple, elegant and straightforward. To that end, we include the following options that can be easily switched on and off, by setting the value in the input XML file.
+We try to cater to a diverse audience with this toolbox, while ensuring it is simple, elegant and straightforward. To that end, we currently include the following options that can be easily switched on and off, by setting the value in the input XML file.
 \begin{itemize} %Should have used a description list for this.
 \item \textbf{Snap to corresponding position}: When this is enabled, and a fragment is playing, the playhead skips to the same position in the next fragment that is clicked. If it is not enabled, every fragment is played from the start.
 \item \textbf{Loop fragments}: Repeat current fragment when end is reached, until the `Stop audio' or `Submit' button is clicked.
 \item \textbf{Comments}: Displays a separate comment box for each fragment in the page.
 \item \textbf{Randomise fragment order}: Randomises the order and numbering of the markers and comment boxes corresponding with the fragments. This permutation is stored as well, to be able to interpret references to the numbers in the comments (such as `this is much [brighter] than 4').
 \item \textbf{Require playback}: Require that each fragment has been played at least once, if not in full.
 \item \textbf{Require full playback}: If `Require playback' is active, require that each fragment has been played in full.
 \item \textbf{Require moving}: Require that each marker is moved (dragged) at least once.
 \item \textbf{Require comments}: Require the subject to write a comment for each track.
-\item \textbf{Repeat test}: Number of times test should be repeated (none by default), to allow familiarisation with the content and experiment, and to investigate consistency of user and variability due to familiarity. In the set up, each 'page' can be given a repeat count. These are all gathered before shuffling the order so repeated tests are not back-to-back if possible.
+\item \textbf{Repeat test}: Number of times each page in the test should be repeated (none by default), to allow familiarisation with the content and experiment, and to investigate consistency of user and variability due to familiarity. In the setup, each 'page' can be given a repeat count. These are all gathered before shuffling the order so repeated tests are not back-to-back if possible.
 \item \textbf{Returning to previous pages}: Indicates whether it is possible to go back to a previous `page' in the test.
 \item \textbf{Lowest rating below [value]}: To enforce a certain use of the rating scale, it can be required to rate at least one sample below a specified value.
 \item \textbf{Highest rating above [value]}: To enforce a certain use of the rating scale, it can be required to rate at least one sample above a specified value.
 \item \textbf{Reference}: Allows for a separate sample (outside of the axis) to be the `reference', which the subject can play back during the test to help with the task at hand \cite{mushra}.
 \item \textbf{Hidden reference}: Whether or not an explicit `reference' is provided, the `hidden reference' should be rated above a certain value \cite{mushra} - this can be enforced.
 \item \textbf{Hidden anchor}: The `hidden anchor' should be rated lower than a certain value \cite{mushra} - this can be enforced.
 \item \textbf{Show scrub bar}: Display a playhead on a scrub bar to show the position in the current fragment.
 \item \textbf{Drag playhead}: If scrub bar is visible, allow dragging to move back or forward in a fragment.
 \end{itemize}
-When one of these options is not included in the set up file, they assume a default value. As a result, the input file can be kept very compact if default values suffice for the test.
+When one of these options is not included in the setup file, they assume a default value. As a result, the input file can be kept very compact if default values suffice for the test.
 % loop, snap to corresponding position, comments, 'general' comment, require same sampling rate, different types of randomisation
 \subsection{Results}
-The results file is dynamically generated by the interface upon clicking the `Submit' button. This also executes checks, depending on the set up file, to ensure that all tracks have been played back, rated and commented on. The XML output returned contains a node per audioObject and contains both the corresponding marker's position and any comments written in the associated comment box. The rating returned is normalised to be a value between 0 and 1, normalising the pixel representation of different browser windows. An example output file is presented in Figure~\ref{fig:xmlOut}
+The results file is dynamically generated by the interface upon clicking the `Submit' button. This also executes checks, depending on the setup file, to ensure that all tracks have been played back, rated and commented on. The XML output returned contains a node per audioObject and contains both the corresponding marker's position and any comments written in the associated comment box. The rating returned is normalised to be a value between 0 and 1, normalising the pixel representation of different browser windows. An example output file is presented in Figure~\ref{fig:xmlOut}.
 \begin{figure}[ht]
 \begin{center}
 \includegraphics[width=0.5\textwidth]{XMLOutput2.png}
-\caption{An Example Output XML File}
+\caption{An example output XML file}
 \label{fig:xmlOut}
 \end{center}
 \end{figure}
-The results also contain information collected by any defined pre/post questions. These are referenced against the set up XML by using the same ID so readable responses can be obtained. Taking from the earlier example of setting up a pre-test question, an example response can be seen in Figure \ref{fig:xmlOut}.
+The results also contain information collected by any defined pre/post questions. These are referenced against the setup XML by using the same ID so readable responses can be obtained. Taking from the earlier example of setting up a pre-test question, an example response can be seen in Figure \ref{fig:xmlOut}.
 Each page of testing is returned with the results of the entire page included in the structure. One `audioElement' node is created per audio fragment per page, along with its ID. This includes several child nodes including the rating between 0 and 1, the comment, and any other collected metrics including how long the element was listened to, the initial position, and boolean flags indicating whether the element was listened to, whether the element was moved and whether the element comment box had any comment. Furthermore, each user action (manipulation of any interface element, such as playback or moving a marker) can be logged along with the corresponding time code.
-Furthermore, we also store session data such as the browser the tool was used in.
+We also store session data such as the browser the tool was used in.
 We provide the option to store the results locally, and/or to have them sent to a server.
 %Here is an example of the set up XML and the results XML: % perhaps best to refer to each XML after each section (set up <> results)
 % Should we include an Example of the input and output XML structure?? --> Sure.
 %<metricresult id="elementFlagListenedTo"> true< /metricresult> \\
 %<metricresult id="elementFlagMoved"> true </metricresult> \\
 %</metric> \\
 %</audioelement>}
-The parent tag \texttt{audioelement} holds the ID of the element passed in from the set up document. The first child element is \texttt{comment} and holds both the question shown and the response from the comment box inside.
+The parent tag \texttt{audioelement} holds the ID of the element passed in from the setup document. The first child element is \texttt{comment} and holds both the question shown and the response from the comment box inside.
-The child element \texttt{value} holds the normalised ranking value. Next comes the metric node structure, there is one metric result node per metric event collected. The id of the node identifies the type of data it contains. For example, the first holds the id \textit{elementTimer} and the data contained represents how long, in seconds, the audio element was listened to. There is one \texttt{audioelement} tag per audio element on each test page.
+The child element \texttt{value} holds the normalised ranking value. Next comes the metric node structure, with one metric result node per metric event collected. The id of the node identifies the type of data it contains. For example, the first holds the id \textit{elementTimer} and the data contained represents how long, in seconds, the audio element was listened to. There is one \texttt{audioelement} tag per audio element on each test page.
 \section{Conclusions and future work}\label{sec:conclusions}
 In this paper we have presented an approach to creating a browser-based listening test environment that can be used for a variety of types of perceptual evaluation of audio.
 Specifically, we discussed the use of the toolbox in the context of assessment of preference for different production practices, with identical source material.
-The purpose of this paper is to outline the design of this tool, to describe our implementation using basic HTML5 functionality, and to discuss design challenges and limitations of our approach. This tool differentiates itself from other perceptual audio tools by enabling web technologies for multiple participants to perform the test without the need for proprietary software such as MATLAB. The tool also allows for any interface to be built using HTML5 elements to create dynamic interfaces built either on current evaluation techniques or newer techniques. It enables quick set up of simple tests with the ability to manage complex tests through a single file. And finally it uses the XML document format to store the results allowing for processing and analysis of results in various third party software such as MATLAB or Python.
+The purpose of this paper is to outline the design of this tool, to describe our implementation using basic HTML5 functionality, and to discuss design challenges and limitations of our approach. This tool differentiates itself from other perceptual audio tools by enabling web technologies for multiple participants to perform the test without the need for proprietary software such as MATLAB. The tool also allows for any interface to be built using HTML5 elements to create a variety of dynamic, multiple-stimulus listening test interfaces. It enables quick setup of simple tests with the ability to manage complex tests through a single file. Finally it uses the XML document format to store the results allowing for processing and analysis of results in various third party software such as MATLAB or Python.
 % future work
 Further work may include the development of other common test designs, such as MUSHRA \cite{mushra}, 2D valence and arousal rating, and others. We will add functionality to assist with setting up large-scale tests with remote subjects, so this becomes straightforward and intuitive.
 In addition, we will keep on improving and expanding the tool, and highly welcome feedback and contributions from the community.
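
The paper text in this diff describes two small computations: checking the system sample rate against the one requested in the setup XML, and normalising a marker's pixel position to a rating between 0 and 1. The following is a minimal sketch of both, not the code in \texttt{core.js} or \texttt{ape.js}. The function names, parameters and the clamping behaviour are assumptions made for illustration.

```javascript
// Hypothetical helpers: names and signatures are illustrative only
// and are not taken from core.js or ape.js.

// Returns true when the system sample rate matches the rate requested
// in the setup XML. A missing requirement (null or undefined) is
// treated as "any rate is acceptable" (an assumption of this sketch).
function sampleRateMatches(systemRate, requiredRate) {
  if (requiredRate === null || requiredRate === undefined) {
    return true;
  }
  // XML attributes arrive as strings, so compare numerically.
  return Number(systemRate) === Number(requiredRate);
}

// Maps a marker's horizontal pixel position onto [0, 1] relative to
// the slider track, so ratings are comparable across window sizes.
// Positions outside the track are clamped (an assumption of this sketch).
function normaliseRating(markerPx, trackLeftPx, trackWidthPx) {
  if (trackWidthPx <= 0) {
    throw new RangeError("trackWidthPx must be positive");
  }
  const ratio = (markerPx - trackLeftPx) / trackWidthPx;
  return Math.min(1, Math.max(0, ratio));
}
```

In a browser, the first argument of `sampleRateMatches` would typically be the `sampleRate` property of an `AudioContext`, and a `false` result would trigger the alert described in the text before any audio files are fetched.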
