comparison docs/SMC15/smc2015template.tex @ 1732:7435309fd918

Paper: Setup, output, conclusion, bibliography
author Brecht De Man <b.deman@qmul.ac.uk>
date Mon, 27 Apr 2015 22:00:22 +0100
parents a596b1cbed6f
children e9e182543f99
comparison
equal deleted inserted replaced
1731:a596b1cbed6f 1732:7435309fd918
254 \item \texttt{index.html}: The main index file to load the scripts, this is the file the browser must request to load. 254 \item \texttt{index.html}: The main index file to load the scripts, this is the file the browser must request to load.
255 \item \texttt{core.js}: Contains global functions and object prototypes to define the audio playback engine, audio objects and loading media files 255 \item \texttt{core.js}: Contains global functions and object prototypes to define the audio playback engine, audio objects and loading media files
256 \item \texttt{ape.js}: Parses setup files to create the interface as instructed, following the same style chain as the MATLAB APE Tool \cite{deman2014b}. 256 \item \texttt{ape.js}: Parses setup files to create the interface as instructed, following the same style chain as the MATLAB APE Tool \cite{deman2014b}.
257 \end{itemize} 257 \end{itemize}
258 258
259 The HTML file loads the \texttt{core.js} file along with a few other ancillary files (such as the jQuery JavaScript extensions\footnote{http://jquery.com/}), at which point the browser JavaScript begins to execute the on-page instructions, which gives the URL of the test set up XML document (outlined in Section \ref{sec:setupresultsformats}). \texttt{core.js} parses this document and executes the functions in \texttt{ape.js} to build the web page. The reason for separating these two files is to allow for further interface designs (such as MUSHRA \cite{mushra} or AB tests \cite{bech}) to be used, which would still require the same underlying core functions outlined in \texttt{core.js}. 259 The HTML file loads the \texttt{core.js} file along with a few other ancillary files (such as the jQuery JavaScript extensions\footnote{http://jquery.com/}), at which point the browser JavaScript begins to execute the on-page instructions, which gives the URL of the test setup XML document (outlined in Section \ref{sec:setupresultsformats}). \texttt{core.js} parses this document and executes the functions in \texttt{ape.js} to build the web page. The reason for separating these two files is to allow for further interface designs (such as MUSHRA \cite{mushra} or AB tests \cite{bech}) to be used, which would still require the same underlying core functions outlined in \texttt{core.js}.
260 260
261 The \texttt{ape.js} file has several functions, of which the most important are documented here. \textit{loadInterface(xmlDoc)} is called to decode the supplied project document with respect to the interface specified and define any global structures (such as the slider interface). It also identifies the number of pages in the test and randomises the order, if specified to do so. This is the only mandatory function in any of the interface files, as it is called by \texttt{core.js} when the document is ready. \texttt{core.js} cannot `see' any interface-specific functions and therefore cannot assume any are available, so \textit{loadInterface(xmlDoc)} is essential to set up the entire test environment. Because the interface files are loaded by \texttt{core.js} and because the functions in \texttt{core.js} are global, the interface files can `see' the \texttt{core.js} file and can therefore not only interact with it, but also modify it. 261 The \texttt{ape.js} file has several functions, of which the most important are documented here. \textit{loadInterface(xmlDoc)} is called to decode the supplied project document with respect to the interface specified and define any global structures (such as the slider interface). It also identifies the number of pages in the test and randomises the order, if specified to do so. This is the only mandatory function in any of the interface files, as it is called by \texttt{core.js} when the document is ready. \texttt{core.js} cannot `see' any interface-specific functions and therefore cannot assume any are available, so \textit{loadInterface(xmlDoc)} is essential to set up the entire test environment. Because the interface files are loaded by \texttt{core.js} and because the functions in \texttt{core.js} are global, the interface files can `see' the \texttt{core.js} file and can therefore not only interact with it, but also modify it.
262 262
263 Each test page is loaded using \textit{loadTest(id)} which performs two major tasks: to populate the interface with the slider elements and comment boxes; and secondly to instruct the \textit{audioEngine} to load the audio fragments and construct the backend audio graph. \textit{loadTest(id)} also instructs the audio engine in \texttt{core.js} to create the \textit{audioObject}. 263 Each test page is loaded using \textit{loadTest(id)} which performs two major tasks: to populate the interface with the slider elements and comment boxes; and secondly to instruct the \textit{audioEngine} to load the audio fragments and construct the backend audio graph. \textit{loadTest(id)} also instructs the audio engine in \texttt{core.js} to create the \textit{audioObject}.
264 These are custom audio nodes, one representing each audio element specified in each page. 264 These are custom audio nodes, one representing each audio element specified in each page.
271 271
272 272
273 273
274 Browsers support various audio file formats and are not consistent in any format. Currently the Web Audio API is best supported in Chrome, Firefox, Opera and Safari. All of these support the use of the uncompressed WAV format. Although not a compact, web friendly format, most transport systems are of a high enough bandwidth this should not be a problem. Ogg Vorbis is another well supported format across the four supported major desktop browsers, as well as MP3 (although Firefox may not support all MP3 types) \footnote{https://developer.mozilla.org/en-US/docs/Web/HTML/\\Supported\_media\_formats}. %https://developer.mozilla.org/en-US/docs/Web/HTML/Supported_media_formats 274 Browsers support various audio file formats and are not consistent in any format. Currently the Web Audio API is best supported in Chrome, Firefox, Opera and Safari. All of these support the use of the uncompressed WAV format. Although not a compact, web friendly format, most transport systems are of a high enough bandwidth this should not be a problem. Ogg Vorbis is another well supported format across the four supported major desktop browsers, as well as MP3 (although Firefox may not support all MP3 types) \footnote{https://developer.mozilla.org/en-US/docs/Web/HTML/\\Supported\_media\_formats}. %https://developer.mozilla.org/en-US/docs/Web/HTML/Supported_media_formats
275 One issue of the Web Audio API is that the sample rate is assigned by the system sound device rather than requested, and the API does not have the ability to request a different one. % Does this make sense? The problem is across all audio files. 275 One issue of the Web Audio API is that the sample rate is assigned by the system sound device rather than requested, and the API does not have the ability to request a different one. % Does this make sense? The problem is across all audio files.
276 As the sampling rate and the effect of resampling may be critical for some listening tests, the default operation when an audio file is loaded with a different sample rate to that of the system is to convert the sample rate. To provide a check for this, the desired sample rate can be supplied with the set up XML and checked against. If the sample rates do not match, a browser alert window is shown asking for the sample rate to be correctly adjusted. 276 As the sampling rate and the effect of resampling may be critical for some listening tests, the default operation when an audio file is loaded with a different sample rate to that of the system is to convert the sample rate. To provide a check for this, the desired sample rate can be supplied with the setup XML and checked against. If the sample rates do not match, a browser alert window is shown asking for the sample rate to be correctly adjusted.
277 This happens before any loading or decoding of audio files so the browser will only be instructed to fetch files if the system sample rate meets the requirements, avoiding multiple requests for large files until they are actually needed. 277 This happens before any loading or decoding of audio files so the browser will only be instructed to fetch files if the system sample rate meets the requirements, avoiding multiple requests for large files until they are actually needed.
278 278
279 %During playback, the playback nodes loop indefinitely until playback is stopped. The gain nodes in the \textit{audioObject}s enable dynamic muting of nodes. When a bar in the sliding ranking is clicked, the audio engine mutes all \textit{audioObject}s and un-mutes the clicked one. Therefore, if the audio samples are perfectly aligned up and of the same sample length, they will remain perfectly aligned with each other. 279 %During playback, the playback nodes loop indefinitely until playback is stopped. The gain nodes in the \textit{audioObject}s enable dynamic muting of nodes. When a bar in the sliding ranking is clicked, the audio engine mutes all \textit{audioObject}s and un-mutes the clicked one. Therefore, if the audio samples are perfectly aligned up and of the same sample length, they will remain perfectly aligned with each other.
280 % Don't think this is relevant anymore 280 % Don't think this is relevant anymore
281 281
282 282
283 \section{Input and result files}\label{sec:setupresultsformats} 283 \section{Input and result files}\label{sec:setupresultsformats}
284 284
285 The set up and result files both use the common XML document format to outline the various parameters. The set up file determines which interface to use, the location of audio files, how many pages and other parameters to define the testing environment. Having one document to modify allows for quick manipulation in a `human readable' form to create new tests, or adjust current ones, without needing to edit multiple web files. An example of this XML document is presented in Figure~\ref{fig:xmlIn}% I mean the .js and .html files, though not sure if any better. 285 The setup and result files both use the common XML document format to outline the various parameters. The setup file determines the interface to use, the location of audio files, the number of pages and other parameters to define the testing environment. Having one document to modify allows for quick manipulation in a `human readable' form to create new tests, or adjust current ones, without needing to edit multiple web files. Furthermore, we also provide a simple web page to enter all these settings without needing to manipulate the raw XML. An example of this XML document is presented in Figure~\ref{fig:xmlIn}. % I mean the .js and .html files, though not sure if any better.
286 286
287 \subsection{Set up and configurability} 287 \subsection{Setup and configurability}
288 288
289 \begin{figure}[ht] 289 \begin{figure}[ht]
290 \begin{center} 290 \begin{center}
291 \includegraphics[width=0.5\textwidth]{XMLInput2.png} 291 \includegraphics[width=0.5\textwidth]{XMLInput2.png}
292 \caption{An Example Input XML File} 292 \caption{An example input XML file}
293 \label{fig:xmlIn} 293 \label{fig:xmlIn}
294 \end{center} 294 \end{center}
295 \end{figure} 295 \end{figure}
296 296
297 The set up document has several defined nodes and structure which are documented with the source code. For example there is a section for general set up options where any pre-test and post-test questions and statements can be defined. Pre- and post-test dialogue boxes allow for comments or questions to be presented before or after the test, to convey listening test instructions, gather information about the subject, listening environment, and overall experience of the test. From the example in Figure~\ref{fig:xmlIn}, it can be seen that a question box should be generated, with the id 'location' and it is mandatory to answer. The question is in the PreTest node meaning it will appear before any testing will begin. When the result for the entire test is shown, the response will appear in the PreTest node with the id 'location' allowing it to be found easily. This outlines the importance of having clear and meaningful ID values. 297 The setup document has several defined nodes and structure which are documented with the source code. For example, there is a section for general setup options where any pre-test and post-test questions and statements can be defined. Pre- and post-test dialogue boxes allow for comments or questions to be presented before or after the test, to convey listening test instructions, and gather information about the subject, listening environment, and overall experience of the test. In the example in Figure~\ref{fig:xmlIn}, a question box with the id `location' is added, which is set to be mandatory to answer. The question is in the PreTest node meaning it will appear before any testing will begin. When the result for the entire test is shown, the response will appear in the PreTest node with the id `location' allowing it to be found easily, provided the id values are meaningful.
298 298
299 We try to cater to a diverse audience with this toolbox, while ensuring the toolbox is simple, elegant and straightforward. To that end, we include the following options that can be easily switched on and off, by setting the value in the input XML file. 299 We try to cater to a diverse audience with this toolbox, while ensuring it is simple, elegant and straightforward. To that end, we currently include the following options that can be easily switched on and off, by setting the value in the input XML file.
300 300
301 \begin{itemize} %Should have used a description list for this. 301 \begin{itemize} %Should have used a description list for this.
302 \item \textbf{Snap to corresponding position}: When this is enabled, and a fragment is playing, the playhead skips to the same position in the next fragment that is clicked. If it is not enabled, every fragment is played from the start. 302 \item \textbf{Snap to corresponding position}: When this is enabled, and a fragment is playing, the playhead skips to the same position in the next fragment that is clicked. If it is not enabled, every fragment is played from the start.
303 \item \textbf{Loop fragments}: Repeat current fragment when end is reached, until the `Stop audio' or `Submit' button is clicked. 303 \item \textbf{Loop fragments}: Repeat current fragment when end is reached, until the `Stop audio' or `Submit' button is clicked.
304 \item \textbf{Comments}: Displays a separate comment box for each fragment in the page. 304 \item \textbf{Comments}: Displays a separate comment box for each fragment in the page.
308 \item \textbf{Randomise fragment order}: Randomises the order and numbering of the markers and comment boxes corresponding with the fragments. This permutation is stored as well, to be able to interpret references to the numbers in the comments (such as `this is much [brighter] then 4'). 308 \item \textbf{Randomise fragment order}: Randomises the order and numbering of the markers and comment boxes corresponding with the fragments. This permutation is stored as well, to be able to interpret references to the numbers in the comments (such as `this is much [brighter] then 4').
309 \item \textbf{Require playback}: Require that each fragment has been played at least once, if not in full. 309 \item \textbf{Require playback}: Require that each fragment has been played at least once, if not in full.
310 \item \textbf{Require full playback}: If `Require playback' is active, require that each fragment has been played in full. 310 \item \textbf{Require full playback}: If `Require playback' is active, require that each fragment has been played in full.
311 \item \textbf{Require moving}: Require that each marker is moved (dragged) at least once. 311 \item \textbf{Require moving}: Require that each marker is moved (dragged) at least once.
312 \item \textbf{Require comments}: Require the subject to write a comment for each track. 312 \item \textbf{Require comments}: Require the subject to write a comment for each track.
313 \item \textbf{Repeat test}: Number of times test should be repeated (none by default), to allow familiarisation with the content and experiment, and to investigate consistency of user and variability due to familiarity. In the set up, each `page' can be given a repeat count. These are all gathered before shuffling the order so repeated tests are not back-to-back if possible. 313 \item \textbf{Repeat test}: Number of times each page in the test should be repeated (none by default), to allow familiarisation with the content and experiment, and to investigate consistency of user and variability due to familiarity. In the setup, each `page' can be given a repeat count. These are all gathered before shuffling the order so repeated tests are not back-to-back if possible.
314 \item \textbf{Returning to previous pages}: Indicates whether it is possible to go back to a previous `page' in the test. 314 \item \textbf{Returning to previous pages}: Indicates whether it is possible to go back to a previous `page' in the test.
315 \item \textbf{Lowest rating below [value]}: To enforce a certain use of the rating scale, it can be required to rate at least one sample below a specified value. 315 \item \textbf{Lowest rating below [value]}: To enforce a certain use of the rating scale, it can be required to rate at least one sample below a specified value.
316 \item \textbf{Highest rating above [value]}: To enforce a certain use of the rating scale, it can be required to rate at least one sample above a specified value. 316 \item \textbf{Highest rating above [value]}: To enforce a certain use of the rating scale, it can be required to rate at least one sample above a specified value.
317 \item \textbf{Reference}: Allows for a separate sample (outside of the axis) to be the `reference', which the subject can play back during the test to help with the task at hand \cite{mushra}. 317 \item \textbf{Reference}: Allows for a separate sample (outside of the axis) to be the `reference', which the subject can play back during the test to help with the task at hand \cite{mushra}.
318 \item \textbf{Hidden reference}: Whether or not an explicit `reference' is provided, the `hidden reference' should be rated above a certain value \cite{mushra} - this can be enforced. 318 \item \textbf{Hidden reference}: Whether or not an explicit `reference' is provided, the `hidden reference' should be rated above a certain value \cite{mushra} - this can be enforced.
319 \item \textbf{Hidden anchor}: The `hidden anchor' should be rated lower than a certain value \cite{mushra} - this can be enforced. 319 \item \textbf{Hidden anchor}: The `hidden anchor' should be rated lower than a certain value \cite{mushra} - this can be enforced.
320 \item \textbf{Show scrub bar}: Display a playhead on a scrub bar to show the position in the current fragment. 320 \item \textbf{Show scrub bar}: Display a playhead on a scrub bar to show the position in the current fragment.
321 \item \textbf{Drag playhead}: If scrub bar is visible, allow dragging to move back or forward in a fragment. 321 \item \textbf{Drag playhead}: If scrub bar is visible, allow dragging to move back or forward in a fragment.
322 \end{itemize} 322 \end{itemize}
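
The `Repeat test' behaviour described in the list above (gather all repeats first, then shuffle so that repeats are not back-to-back if possible) can be sketched as follows. This is only an illustrative sketch in Python, not the tool's JavaScript implementation: the function names, the retry limit, and the reading of a repeat count of 0 as `shown once' are assumptions.

```python
import random
from typing import Dict, List, Optional


def has_adjacent_repeat(order: List[str]) -> bool:
    """Return True if the same page appears twice in a row."""
    return any(a == b for a, b in zip(order, order[1:]))


def build_page_order(repeat_counts: Dict[str, int],
                     rng: Optional[random.Random] = None,
                     max_attempts: int = 100) -> List[str]:
    """Expand each page by its repeat count, then shuffle the full list.

    Assumption: a repeat count of 0 means the page is shown once.
    The shuffle is retried until no page follows itself; if that never
    happens within max_attempts, the last shuffle is returned as is
    (the "if possible" part of the description).
    """
    rng = rng or random.Random()
    # Gather every presentation of every page before shuffling.
    order = [page
             for page, repeats in repeat_counts.items()
             for _ in range(repeats + 1)]
    for _ in range(max_attempts):
        rng.shuffle(order)
        if not has_adjacent_repeat(order):
            break
    return order


order = build_page_order({"a": 1, "b": 1, "c": 0}, rng=random.Random(0))
```

Retrying the shuffle is the simplest way to express the `if possible' condition; when a test consists of a single repeated page, back-to-back repeats cannot be avoided and the order is returned unchanged.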
323 323
324 When one of these options is not included in the set up file, they assume a default value. As a result, the input file can be kept very compact if default values suffice for the test. 324 When one of these options is not included in the setup file, they assume a default value. As a result, the input file can be kept very compact if default values suffice for the test.
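
As a rough illustration of how omitted options can fall back to defaults, the sketch below reads boolean options from a setup document using Python's standard library. The root element name, the attribute names (\texttt{loop}, \texttt{snap}, \texttt{comments}, \texttt{randomiseOrder}) and the default values are hypothetical and chosen only for this example; the actual node structure is the one documented with the source code.

```python
import xml.etree.ElementTree as ET

# Hypothetical option names and default values, for illustration only.
DEFAULTS = {
    "loop": False,
    "snap": False,
    "comments": True,
    "randomiseOrder": False,
}


def read_options(setup_xml: str) -> dict:
    """Return all options, using the default for any that are absent."""
    root = ET.fromstring(setup_xml)
    options = dict(DEFAULTS)
    for name in DEFAULTS:
        raw = root.get(name)  # None when the attribute is not present
        if raw is not None:
            options[name] = raw.strip().lower() == "true"
    return options


compact = read_options('<setup loop="true"/>')
```

With this approach a setup file that only states \texttt{loop="true"} still yields a complete set of options, which is why the input file can stay compact.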
325 325
326 % loop, snap to corresponding position, comments, 'general' comment, require same sampling rate, different types of randomisation 326 % loop, snap to corresponding position, comments, 'general' comment, require same sampling rate, different types of randomisation
327 327
328 \subsection{Results} 328 \subsection{Results}
329 329
330 The results file is dynamically generated by the interface upon clicking the `Submit' button. This also executes checks, depending on the set up file, to ensure that all tracks have been played back, rated and commented on. The XML output returned contains a node per audioObject and contains both the corresponding marker's position and any comments written in the associated comment box. The rating returned is normalised to be a value between 0 and 1, normalising the pixel representation of different browser windows. An example output file is presented in Figure~\ref{fig:xmlOut} 330 The results file is dynamically generated by the interface upon clicking the `Submit' button. This also executes checks, depending on the setup file, to ensure that all tracks have been played back, rated and commented on. The XML output returned contains a node per audioObject and contains both the corresponding marker's position and any comments written in the associated comment box. The rating returned is normalised to be a value between 0 and 1, normalising the pixel representation of different browser windows. An example output file is presented in Figure~\ref{fig:xmlOut}.
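
One plausible form of this normalisation, given here as an assumption rather than the exact expression used in the tool, maps the horizontal pixel position $x$ of a marker onto the slider, whose ends lie at pixel positions $x_{\min}$ and $x_{\max}$ in the current browser window:

\[
r = \frac{x - x_{\min}}{x_{\max} - x_{\min}}, \qquad 0 \le r \le 1 .
\]

For example, a marker at $x = 500$ on a slider spanning $x_{\min} = 100$ to $x_{\max} = 900$ gives $r = 400/800 = 0.5$, and the same relative position in a wider window ($x = 1000$, $x_{\min} = 200$, $x_{\max} = 1800$) also gives $r = 800/1600 = 0.5$, so ratings are comparable across window sizes.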
331 331
332 \begin{figure}[ht] 332 \begin{figure}[ht]
333 \begin{center} 333 \begin{center}
334 \includegraphics[width=0.5\textwidth]{XMLOutput2.png} 334 \includegraphics[width=0.5\textwidth]{XMLOutput2.png}
335 \caption{An Example Output XML File} 335 \caption{An example output XML file}
336 \label{fig:xmlOut} 336 \label{fig:xmlOut}
337 \end{center} 337 \end{center}
338 \end{figure} 338 \end{figure}
339 339
340 The results also contain information collected by any defined pre/post questions. These are referenced against the set up XML by using the same ID so readable responses can be obtained. Taking from the earlier example of setting up a pre-test question, an example response can be seen in Figure \ref{fig:xmlOut}. 340 The results also contain information collected by any defined pre/post questions. These are referenced against the setup XML by using the same ID so readable responses can be obtained. Taking from the earlier example of setting up a pre-test question, an example response can be seen in Figure \ref{fig:xmlOut}.
341 341
342 Each page of testing is returned with the results of the entire page included in the structure. One `audioElement' node is created per audio fragment per page, along with its ID. This includes several child nodes including the rating between 0 and 1, the comment, and any other collected metrics including how long the element was listened for, the initial position, boolean flags if the element was listened to, if the element was moved and if the element comment box had any comment. Furthermore, each user action (manipulation of any interface element, such as playback or moving a marker) can be logged along with the corresponding time code. 342 Each page of testing is returned with the results of the entire page included in the structure. One `audioElement' node is created per audio fragment per page, along with its ID. This includes several child nodes including the rating between 0 and 1, the comment, and any other collected metrics including how long the element was listened for, the initial position, boolean flags if the element was listened to, if the element was moved and if the element comment box had any comment. Furthermore, each user action (manipulation of any interface element, such as playback or moving a marker) can be logged along with the corresponding time code.
343 Furthermore, we also store session data such as the browser the tool was used in. 343 We also store session data such as the browser the tool was used in.
344 344
345 We provide the option to store the results locally, and/or to have them sent to a server. 345 We provide the option to store the results locally, and/or to have them sent to a server.
346 346
347 %Here is an example of the set up XML and the results XML: % perhaps best to refer to each XML after each section (set up <> results) 347 %Here is an example of the set up XML and the results XML: % perhaps best to refer to each XML after each section (set up <> results)
348 % Should we include an Example of the input and output XML structure?? --> Sure. 348 % Should we include an Example of the input and output XML structure?? --> Sure.
370 %<metricresult id="elementFlagListenedTo"> true< /metricresult> \\ 370 %<metricresult id="elementFlagListenedTo"> true< /metricresult> \\
371 %<metricresult id="elementFlagMoved"> true </metricresult> \\ 371 %<metricresult id="elementFlagMoved"> true </metricresult> \\
372 %</metric> \\ 372 %</metric> \\
373 %</audioelement>} 373 %</audioelement>}
374 374
375 The parent tag \texttt{audioelement} holds the ID of the element passed in from the set up document. The first child element is \texttt{comment} and holds both the question shown and the response from the comment box inside. 375 The parent tag \texttt{audioelement} holds the ID of the element passed in from the setup document. The first child element is \texttt{comment} and holds both the question shown and the response from the comment box inside.
376 The child element \texttt{value} holds the normalised ranking value. Next comes the metric node structure, there is one metric result node per metric event collected. The id of the node identifies the type of data it contains. For example, the first holds the id \textit{elementTimer} and the data contained represents how long, in seconds, the audio element was listened to. There is one \texttt{audioelement} tag per audio element on each test page. 376 The child element \texttt{value} holds the normalised ranking value. Next comes the metric node structure, with one metric result node per metric event collected. The id of the node identifies the type of data it contains. For example, the first holds the id \textit{elementTimer} and the data contained represents how long, in seconds, the audio element was listened to. There is one \texttt{audioelement} tag per audio element on each test page.
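
Since the results are plain XML, they can be processed with standard tools. The following Python sketch reads one \texttt{audioelement} node with the standard library. The tag names \texttt{audioelement}, \texttt{comment}, \texttt{value}, \texttt{metric} and \texttt{metricresult}, and the ids \textit{elementTimer}, \textit{elementFlagListenedTo} and \textit{elementFlagMoved}, follow the description above; the \texttt{question} and \texttt{response} child names and all example values are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up example data; <question> and <response> are assumed names.
EXAMPLE = """
<audioelement id="0">
  <comment>
    <question>Comment on track 0</question>
    <response>Too bright</response>
  </comment>
  <value>0.72</value>
  <metric>
    <metricresult id="elementTimer">12.5</metricresult>
    <metricresult id="elementFlagListenedTo">true</metricresult>
    <metricresult id="elementFlagMoved">true</metricresult>
  </metric>
</audioelement>
"""


def parse_audioelement(node: ET.Element) -> dict:
    """Collect the rating, comment and metrics of one audio element."""
    # Map each metric id to its text; a repeated id keeps the last value.
    metrics = {m.get("id"): (m.text or "").strip()
               for m in node.iter("metricresult")}
    return {
        "id": node.get("id"),
        "response": (node.findtext("comment/response") or "").strip(),
        "value": float(node.findtext("value")),
        "listened_seconds": float(metrics.get("elementTimer", 0)),
        "listened": metrics.get("elementFlagListenedTo") == "true",
        "moved": metrics.get("elementFlagMoved") == "true",
    }


result = parse_audioelement(ET.fromstring(EXAMPLE.strip()))
```

Applying the same function to every \texttt{audioelement} node of every page gives one record per fragment, which can then be passed to any statistics package.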
377 377
378 378
379 \section{Conclusions and future work}\label{sec:conclusions} 379 \section{Conclusions and future work}\label{sec:conclusions}
380 380
381 In this paper we have presented an approach to creating a browser-based listening test environment that can be used for a variety of types of perceptual evaluation of audio. 381 In this paper we have presented an approach to creating a browser-based listening test environment that can be used for a variety of types of perceptual evaluation of audio.
382 Specifically, we discussed the use of the toolbox in the context of assessment of preference for different production practices, with identical source material. 382 Specifically, we discussed the use of the toolbox in the context of assessment of preference for different production practices, with identical source material.
383 The purpose of this paper is to outline the design of this tool, to describe our implementation using basic HTML5 functionality, and to discuss design challenges and limitations of our approach. This tool differentiates itself from other perceptual audio tools by enabling web technologies for multiple participants to perform the test without the need for proprietary software such as MATLAB. The tool also allows for any interface to be built using HTML5 elements to create dynamic interfaces built either on current evaluation techniques or newer techniques. It enables quick set up of simple tests with the ability to manage complex tests through a single file. And finally it uses the XML document format to store the results allowing for processing and analysis of results in various third party software such as MATLAB or Python. 383 The purpose of this paper is to outline the design of this tool, to describe our implementation using basic HTML5 functionality, and to discuss design challenges and limitations of our approach. This tool differentiates itself from other perceptual audio tools by enabling web technologies for multiple participants to perform the test without the need for proprietary software such as MATLAB. The tool also allows for any interface to be built using HTML5 elements to create a variety of dynamic, multiple-stimulus listening test interfaces. It enables quick setup of simple tests with the ability to manage complex tests through a single file. Finally it uses the XML document format to store the results allowing for processing and analysis of results in various third party software such as MATLAB or Python.
384 384
385 % future work 385 % future work
386 Further work may include the development of other common test designs, such as MUSHRA \cite{mushra}, 2D valence and arousal rating, and others. We will add functionality to assist with setting up large-scale tests with remote subjects, so this becomes straightforward and intuitive. 386 Further work may include the development of other common test designs, such as MUSHRA \cite{mushra}, 2D valence and arousal rating, and others. We will add functionality to assist with setting up large-scale tests with remote subjects, so this becomes straightforward and intuitive.
387 In addition, we will keep on improving and expanding the tool, and highly welcome feedback and contributions from the community. 387 In addition, we will keep on improving and expanding the tool, and highly welcome feedback and contributions from the community.
388 388