Ethnographic observations at the British Library
This section presents several themes for adapting or improving Sonic Visualiser (SV) for musicological research, based on ethnographic observations carried out at the British Library from February to May 2011 by Mathieu Barthet.
Musicologists alternate between two listening practices: closed listening (without visualisation) and multimodal listening (with visualisation). Cross-modal effects occur between auditory and visual feedback. Because of these effects, it is deemed important to start with closed listening, and then to use multimodal listening if necessary.
SV could offer a closed listening mode (without visualisation) and a multimodal listening mode (with visualisation). Closed listening could be associated with a basic mode (or skin), in the spirit of VLC, with basic playback functionalities (e.g. play/stop, navigation, volume, equalization). Multimodal listening could be associated with an advanced mode (or skin) offering visualisations (waveform, spectrogram) and more advanced functionalities (e.g. Vamp plugin transforms).
Automation/personalisation of spectrogram measurements:
The measurement tool provided by SV is used to measure time-frequency related variations, like the rate and extent of a vibrato. The process presents several drawbacks:
- it can be time-consuming (when performed on many different performances/notes),
- it may not be systematic: the measurement tool is adjusted manually to match the amplitude variations of a tone's partial, so the precision of the process depends on the level of magnification, eye sensitivity, and the visualisation settings chosen by the user.
- The measurement tool could be associated with a tone partial tracking functionality that automatically detects the amplitude variations of the partial in the selected area. Descriptors of these variations could then be computed (e.g. mean, variance, rate, extent, regularity). One solution would be to integrate some functionalities of Xue Wang’s Harmonic Visualiser into SV (including audio synthesis of selected partials). One difficulty would be to develop the framework that allows interaction with the spectrogram in SV. Another possibility would be to design a tone partial tracking Vamp plugin that takes the parameters of the measurement rectangle as input.
- Users should be able to save the settings used in spectrogram visualisation (color, scale, window, bins, and magnification) so that they can be reused when analysing other audio files. This could be handled with customizable SV templates.
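As an illustration of the descriptors mentioned above, the following minimal sketch summarises a track that a partial tracker might produce. It assumes the tracker outputs one frequency value (in Hz) per analysis frame at a known frame rate; the function name `vibrato_descriptors` and the zero-crossing rate estimate are assumptions made for this example, not part of SV or of any existing Vamp plugin.

```python
from statistics import mean, pvariance


def vibrato_descriptors(freqs_hz, frame_rate_hz):
    """Summarise a partial's frequency track (one value per analysis frame).

    Hypothetical helper: the input format and the descriptor definitions
    are assumptions chosen for illustration.
    """
    if len(freqs_hz) < 2:
        raise ValueError("need at least two frames")
    centre = mean(freqs_hz)
    deviations = [f - centre for f in freqs_hz]
    # Extent: half the peak-to-peak excursion, in Hz.
    extent = (max(freqs_hz) - min(freqs_hz)) / 2.0
    # Rate: count upward zero crossings of the deviation from the mean;
    # each crossing marks roughly one vibrato cycle.
    upward = sum(1 for a, b in zip(deviations, deviations[1:]) if a < 0 <= b)
    duration_s = (len(freqs_hz) - 1) / frame_rate_hz
    return {
        "mean_hz": centre,
        "variance": pvariance(freqs_hz),
        "extent_hz": extent,
        "rate_hz": upward / duration_s,
    }
```

The zero-crossing count is only a rough rate estimate and is sensitive to noise in the track; a real implementation would probably smooth the track first or use a spectral estimate.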
Audio feedback and sonification of metadata:
Users tried to click on notes in the piano representation shown alongside the melodic spectrogram visualisation in order to listen to them.
It would be useful to add audio feedback for notes in the melodic spectrogram visualisation. More broadly, it would be useful to sonify the metadata extracted by Vamp plugins when relevant (e.g. chords).
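A minimal sketch of what note audio feedback could involve: converting a MIDI note number to a frequency (equal temperament, A4 = 440 Hz) and rendering a short sine tone as WAV data. The function names, the sine timbre and the 10 ms fade are assumptions for illustration; this says nothing about how SV itself would play the note.

```python
import io
import math
import struct
import wave


def midi_to_hz(note):
    """Equal-tempered frequency of a MIDI note number (69 = A4 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def render_note(note, duration_s=0.5, sample_rate=44100, gain=0.3):
    """Return mono 16-bit WAV bytes containing a sine tone for the note."""
    n = round(duration_s * sample_rate)
    freq = midi_to_hz(note)
    fade = min(n // 2, round(0.01 * sample_rate))  # short fade to avoid clicks
    frames = bytearray()
    for i in range(n):
        env = 1.0
        if fade and i < fade:
            env = i / fade
        elif fade and i >= n - fade:
            env = (n - 1 - i) / fade
        sample = gain * env * math.sin(2.0 * math.pi * freq * i / sample_rate)
        frames += struct.pack("<h", int(sample * 32767))
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate)
        w.writeframes(bytes(frames))
    return buf.getvalue()
```

A chord extracted by a plugin could be sonified in the same spirit by summing the tones of its notes before writing the frames.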
Musicologists often use scores while listening. They prefer to read the score as a full page, not in chunks. They often use specific score editions, which can be obtained as scanned PDF copies from online sheet music databases (e.g. IMSLP). Visualising the performers’ expressive deviations from the score could enhance performance practice analysis.
- SV could provide an “Import score” functionality supporting several formats (symbolic, like MIDI and MusicXML, and images, like PDF) [MIDI can already be imported as an annotation layer]. The UI should make it possible to view the score in a page mode on the screen (using e.g. a specific score template). Part of the code of Rosegarden from Chris Cannam may be used for that purpose (difficulty: Rosegarden is built on GTK and not Qt).
- An Optical Music Recognition (OMR) engine (e.g. SharpEye) could be embedded into SV to convert PDF scores into machine-readable notation. State-of-the-art tools still perform poorly on hand-produced scores (see the related post on the IMSLP forum: http://imslpforums.org/viewtopic.php?f=12&t=2805).
- A collaborative project with online sheet music databases (e.g. MuseScore, IMSLP) could be set up to design dedicated APIs / SPARQL endpoints that automatically retrieve scores within SV when these are available in the database (using the audio file metadata).
- Score visualisations could be further associated with audio-to-score and lyrics-to-audio alignment techniques.
- Assuming a reliable audio-to-score alignment technique, SV could provide various visualisations of performers’ expressive deviations from the score, including timing, pitch, and dynamics. The user should be able to integrate implicit rules of interpretation that are not written in the score into the expressive deviation visualisations (e.g. the notes inégales pattern).
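To make the idea of timing deviations concrete, here is a minimal sketch that assumes an alignment has already paired each score onset (in beats) with a performed onset (in seconds) and that a single nominal tempo applies. The names `timing_deviations` and `inegales`, and the way the interpretation rule is modelled (delaying the second quaver of each beat), are hypothetical choices for this example.

```python
import math


def inegales(long_share=2.0 / 3.0):
    """Hypothetical rule: move the mid-beat quaver later (long-short pairs)."""
    def rule(beat):
        whole = math.floor(beat)
        if abs(beat - whole - 0.5) < 1e-9:
            return whole + long_share
        return beat
    return rule


def timing_deviations(score_beats, performed_s, bpm, rule=None):
    """Performed minus expected onset time (seconds) for each aligned note.

    Expected times follow a constant nominal tempo, anchored on the first
    note; an optional rule adjusts the score positions beforehand.
    """
    if len(score_beats) != len(performed_s):
        raise ValueError("score and performance must have the same length")
    if not score_beats:
        return []
    beats = [rule(b) if rule else b for b in score_beats]
    seconds_per_beat = 60.0 / bpm
    origin_beat, origin_time = beats[0], performed_s[0]
    return [
        t - (origin_time + (b - origin_beat) * seconds_per_beat)
        for b, t in zip(beats, performed_s)
    ]
```

With a rule such as `inegales()` supplied, a performer who plays conventional long-short quavers shows deviations close to zero, so the remaining deviations reflect choices beyond the convention.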
Statistics on acoustic features could also be provided at various musical time scales using information from the score, e.g. at the note level, the phrase level (which requires additional rules), or the bar level.
Text editor functionality:
Musicologists often write down notes while listening, then proofread and enrich them during further listening. They often work with speech recordings and perform transcriptions. Switching between different devices (e.g. a CD player and the computer) or software (e.g. the text editor and the audio player on a computer) can be time-consuming and irritating. Time positions (e.g. tape counter readings) are often written down manually to connect the notes with a position in the recording.
SV should integrate a text editing pane for writing notes while listening. Users should be able to link the written notes to a time position in the audio signal. The notes should be exportable in standard formats (e.g. RTF) so that they can be shared and further modified in standard text editors. Playback control could be made easy using keyboard shortcuts (so that users can stay in the text editing pane while listening, rewinding, etc.). SV could also make it possible to control playback with a MIDI footswitch pedal.
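A minimal sketch of how notes could be linked to time positions and exported. The `Notebook` class, the `[MM:SS.mmm]` prefix and the plain-text export are assumptions for illustration only; RTF export, as suggested above, would need a dedicated writer.

```python
from dataclasses import dataclass, field


def format_timestamp(seconds):
    """Format a playback position as MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    minutes, rem = divmod(total_ms, 60000)
    secs, ms = divmod(rem, 1000)
    return f"{minutes:02d}:{secs:02d}.{ms:03d}"


@dataclass
class Notebook:
    """Hypothetical container for listening notes tied to a recording."""
    entries: list = field(default_factory=list)

    def add(self, time_s, text):
        # time_s would come from the player's current playback position.
        self.entries.append((float(time_s), text))

    def export_text(self):
        # Notes are exported in time order, whatever order they were written in.
        lines = [
            f"[{format_timestamp(t)}] {text}"
            for t, text in sorted(self.entries, key=lambda e: e[0])
        ]
        return "\n".join(lines)
```

Because each entry stores the position in seconds, the same data could drive a "jump to this note" action in the player as well as the export.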
Enhancing speech and music recordings analysis:
Musicologists often use broadcast recordings including both speech and music. They often have to transcribe interviews.
Content-based MIR techniques could be used to:
- facilitate the navigation between speech and music sections (development of a Vamp plugin for automatic speech/music segmentation);
- transcribe the interviews (development of a Vamp plugin for automatic speech recognition).
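The navigation use case can be sketched independently of how the segmentation is obtained. The example below assumes a speech/music segmenter has already produced labelled segments; the `Segment` type, the label strings and `next_section_start` are hypothetical names, not the output format of an existing Vamp plugin.

```python
from typing import NamedTuple, Optional, Sequence


class Segment(NamedTuple):
    start_s: float
    end_s: float
    label: str  # assumed labels: "speech" or "music"


def next_section_start(
    segments: Sequence[Segment], current_s: float, label: str
) -> Optional[float]:
    """Start time of the next segment with this label that begins strictly
    after the current playback position, or None if there is none."""
    starts = [
        s.start_s
        for s in segments
        if s.label == label and s.start_s > current_s
    ]
    return min(starts) if starts else None
```

A "skip to next music section" command would then seek the player to the returned time, and do nothing when the result is `None`.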
Sound editor functionality:
Before analysing music recordings, musicologists often have to digitize them (e.g. from LP) or rip them from CD. They also often need sound examples to illustrate talks or lectures.
SV could integrate some basic sound editor functionalities like those proposed by Audacity (e.g. CD ripping, cut and paste, amplitude envelope modifications, record from the input audio channel).