Musicologists » History » Version 7

Mathieu Barthet, 2011-06-08 02:51 PM
new comment in personalisation of spectrogram measurements

h1. Musicologists

h2. Ethnographic observations at the British Library

This section presents several ways to adapt or improve Sonic Visualiser (SV) for musicological research, based on ethnographic observations carried out by Mathieu Barthet at the British Library from February to May 2011.

*User Interface:*

+Observations:+
Musicologists alternate between two listening practices: closed listening (without visualisation) and multimodal listening (with visualisation). Because cross-modal effects occur between auditory and visual feedback, it is deemed important to start with closed listening and to use multimodal listening only if necessary.

+Solutions:+
SV could offer a closed listening mode (without visualisation) and a multimodal listening mode (with visualisation). Closed listening could be associated with a basic mode (or skin), in the spirit of VLC, providing basic playback functionality (e.g. play/stop, navigation, volume, equalisation). Multimodal listening could be associated with an advanced mode (or skin) offering visualisations (waveform, spectrogram) and more advanced functionality (e.g. Vamp plugin transforms).

_(CC: I'm not very familiar with vlc, but I think I can picture what you're talking about. The waveform counts as a visualisation, I assume)_

* template solution:

One straightforward way to allow closed listening (without visuals) would be to design a dedicated template showing only the main playback control buttons, the green waveform overview normally located at the bottom of the SV window (it is small, so I don't think it would affect listening, and it is useful for navigation), and the time-stretching and volume controls. The drawback is that the resulting interface does not look very attractive.

Related issues:
- how can the space allocated to the pane in the main window be removed?
- allow some buttons to be removed from the toolbar when saving a template
- make the property box invisible (which is currently not the case when saving a template)
- bug reported: the application crashes when the waveform pane is removed so that only the waveform overview remains visible

One advantage of this solution is that it lets users choose the default template they prefer. Some musicologists may prefer to start with the closed listening template; other musicologists, or other types of users, may not.

* view modes solution:

An alternative would be to add two view modes to the View menu, Closed listening and Multimodal listening (or other terms), so that users can switch directly from one to the other. The toggle could also be triggered by a button (such as the "lozenge" button in the upper right corner of windows on Mac OS X).

Related issues:
- Is there a way to design a lighter SV that acts only as a player at start-up and provides the full functionality on demand when the view mode is changed? Would the application launch faster if it started in such a simple mode, or do all the Qt libraries need to be loaded at launch anyway?

The template solution seems to be the most straightforward.

*Automation/personalisation of spectrogram measurements:*

+Observations:+
The measurement tool provided by SV is used to measure time-frequency variations, such as the rate and extent of a vibrato. The process has several drawbacks:
- it can be time-consuming (when performed on many different performances/notes),
- it may not be systematic (the measurement tool is adjusted manually to match the amplitude variations of a tone’s partial, so the precision of the process depends on the level of magnification, the user's visual acuity, and the visualisation settings chosen by the user).

+Solutions:+
- The measurement tool could be associated with a partial tracking functionality that automatically detects the amplitude variations of the partial in the selected area. Descriptors of these variations could then be computed (e.g. mean, variance, rate, extent, regularity). One solution would be to integrate some functionality of Xue Wen’s Harmonic Visualiser into SV (including audio synthesis of selected partials). One of the difficulties would be developing the framework that allows interaction with the spectrogram in SV. Another possibility would be to design a partial tracking Vamp plugin that takes the parameters of the measurement rectangle as input.  _(CC: remember that Vamp plugins have a very unsophisticated notion of parameters, although you could potentially provide min+max frequency and supply only the audio region whose duration is beneath the rectangle)_

- Users should be able to save the settings used for spectrogram visualisation (colour, scale, window, bins, and magnification) so that they can be applied when analysing other audio files. This can be handled with customisable SV templates.  _(CC: review the "templating" branch of current SV repositories and see how far you think this can be helpful as it stands)_ _(MB: yes, templates make it possible to save the spectrogram parameters.)_
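
As an illustration of the descriptors mentioned above, here is a minimal Python sketch. It assumes that a partial tracker (not shown, and not part of SV as it stands) has already produced one frequency estimate per analysis frame inside the measurement rectangle. The function name and the zero-crossing rate estimate are illustrative choices only; regularity is not computed.

```python
import math
from statistics import mean, pvariance


def vibrato_descriptors(freqs_hz, frame_rate_hz):
    """Summarise a partial's frequency track (one estimate per analysis frame)."""
    if len(freqs_hz) < 2:
        raise ValueError("need at least two frames")
    centre_hz = mean(freqs_hz)
    deviations = [f - centre_hz for f in freqs_hz]
    # Each vibrato cycle crosses the mean twice, so count sign changes.
    crossings = sum(
        1 for a, b in zip(deviations, deviations[1:]) if (a < 0) != (b < 0)
    )
    duration_s = (len(freqs_hz) - 1) / frame_rate_hz
    half_range_hz = (max(freqs_hz) - min(freqs_hz)) / 2
    return {
        "mean_hz": centre_hz,
        "variance_hz2": pvariance(freqs_hz),
        "rate_hz": crossings / (2 * duration_s),
        "extent_hz": half_range_hz,
        # Extent expressed in cents above the mean frequency.
        "extent_cents": 1200 * math.log2((centre_hz + half_range_hz) / centre_hz),
    }


# Synthetic check: a 440 Hz partial with a 6 Hz, +/-10 Hz vibrato, 100 frames/s.
track = [440 + 10 * math.sin(2 * math.pi * 6 * i / 100 + 0.3) for i in range(200)]
result = vibrato_descriptors(track, frame_rate_hz=100)
print(result)
```

Counting zero crossings is a crude rate estimate that works on clean tracks; a noisy track would probably need smoothing or a spectral estimate of the modulation instead.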

*Audio feedback and sonification of metadata:*

+Observations:+
Users tried to click on the notes of the piano keyboard displayed alongside the melodic range spectrogram in order to listen to them.  _(CC: Should be a practical and useful addition)_ _(MB: What bit of code handles the display of the piano notes in the melodic range spectrogram? Is it handled by the SpectrogramLayer?)_

+Solutions:+
It would be useful to add audio feedback for notes in the melodic range spectrogram visualisation. More broadly, it would be useful to sonify the metadata extracted by Vamp plugins when relevant (e.g. chords).  _(CC: This is possible in some cases, e.g. the Chordino plugin has an output which produces a MIDI-style note representation of its chords and SV will play that)_
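
To make the note feedback idea concrete, the sketch below (Python, standard library only) converts a MIDI note number to a frequency in equal temperament and synthesises a short sine tone with a brief fade in and out. This is not how SV plays notes; the function names, the A4 = 440 Hz reference, and the 10 ms fade are assumptions made for the example.

```python
import math


def midi_to_hz(note, a4_hz=440.0):
    """Equal-tempered frequency of a MIDI note number (69 is A4)."""
    return a4_hz * 2 ** ((note - 69) / 12)


def sine_tone(note, duration_s=0.5, sample_rate=44100, amplitude=0.3):
    """Return float samples in [-amplitude, amplitude] for the given note."""
    freq_hz = midi_to_hz(note)
    n = int(duration_s * sample_rate)
    # Short linear fade (about 10 ms) at both ends to avoid clicks.
    fade = min(n // 2, int(0.01 * sample_rate))
    samples = []
    for i in range(n):
        gain = min(1.0, i / fade, (n - 1 - i) / fade) if fade else 1.0
        samples.append(
            amplitude * gain * math.sin(2 * math.pi * freq_hz * i / sample_rate)
        )
    return samples


tone = sine_tone(60, duration_s=0.5, sample_rate=8000)  # middle C
print(len(tone), round(midi_to_hz(60), 2))
```

A click on a piano key would then only need to map the key to its MIDI note number and hand the samples to the audio output.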

*Scores:*

+Observations:+
Musicologists often use scores while listening. They prefer to read the score as full pages rather than in chunks. They often use specific score editions, which can be obtained as scanned PDF copies from online sheet music databases (e.g. IMSLP). Visualising the performers’ expressive deviations from the score could enhance the analysis of performance practice.

+Solutions:+
- SV could provide an “Import score” function supporting several formats (symbolic ones such as MIDI and MusicXML, and images such as PDF) [MIDI can already be imported as an annotation layer] _(CC: It's also possible to import individual images into an image layer: at one point I had planned to add PDF import into a series of images, perhaps using the Poppler PDF library, but I never produced any code for that)_. The UI should make it possible to view the score in a page mode on screen (e.g. using a specific score template). Part of the code of Rosegarden, by Chris Cannam, could be used for that purpose (difficulty: Rosegarden is built on GTK and not Qt). _(CC: Not true, RG uses Qt4 like SV)_

- An Optical Music Recognition (OMR) engine (e.g. SharpEye) could be embedded into SV to convert PDF scores into machine-readable notation. State-of-the-art tools still perform poorly on hand-produced scores (see the related post on the IMSLP forum: http://imslpforums.org/viewtopic.php?f=12&t=2805). _(CC: This sounds like overkill to me, given the usually large amount of manual post-processing that OMR requires)_

- A collaborative project with online sheet music databases (e.g. MuseScore, IMSLP) could be set up to design dedicated APIs / SPARQL endpoints for automatically retrieving scores within SV when they are available in the database (using the audio file metadata).

- Score visualisations could be further associated with audio-to-score and lyrics-to-audio alignment techniques.

- Assuming a reliable audio-to-score alignment technique, SV could provide various visualisations of performers’ expressive deviations from the score, including timing, pitch, and dynamics. Users should be able to integrate implicit rules of interpretation that are not written in the score into these visualisations (e.g. notes inégales patterns).
Statistics of acoustical features could also be provided at various musical time scales using information from the score: e.g. note level, phrase level (requires additional rules), bar level.
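
The timing part of this idea can be sketched as follows in Python, assuming an alignment step (not shown) has already paired each score note with a performed onset time. The nominal timeline here is simply a constant tempo anchored on the first note, which is an assumption of the example; implicit rules such as notes inégales would have to adjust the nominal onsets before the comparison. All names are hypothetical.

```python
from statistics import mean


def timing_deviations(score_beats, onsets_s, tempo_bpm):
    """Return performed minus nominal onset time (seconds) for each note."""
    seconds_per_beat = 60.0 / tempo_bpm
    # Anchor the nominal timeline on the first performed note.
    origin_s = onsets_s[0] - score_beats[0] * seconds_per_beat
    return [
        onset - (origin_s + beat * seconds_per_beat)
        for beat, onset in zip(score_beats, onsets_s)
    ]


def mean_per_bar(score_beats, values, beats_per_bar):
    """Average a per-note value over each bar (bar 0 is the first bar)."""
    bars = {}
    for beat, value in zip(score_beats, values):
        bars.setdefault(int(beat // beats_per_bar), []).append(value)
    return {bar: mean(vals) for bar, vals in sorted(bars.items())}


beats = [0, 1, 2, 3, 4, 5, 6, 7]
onsets = [0.0, 1.1, 2.0, 3.1, 4.0, 5.0, 6.0, 7.0]
deviations = timing_deviations(beats, onsets, tempo_bpm=60)
per_bar = mean_per_bar(beats, deviations, beats_per_bar=4)
print(per_bar)
```

The same grouping function could be reused for other per-note features (pitch deviation, loudness), and a phrase-level grouping would need the additional segmentation rules mentioned above.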

*Text editor functionality:*

+Observations:+
Musicologists often write notes while listening, then proofread and enrich them during further listening. They often work with speech recordings and transcribe them. Switching between devices (e.g. a CD player and a computer) or between software (e.g. the text editor and the audio player on a computer) can be time-consuming and irritating. Time positions (e.g. tape counter readings) are often noted manually to link the notes to a position in the recording.

+Solutions:+
SV should integrate a text editing pane for writing notes while listening. Users should be able to link the written notes to a time position in the audio signal. The notes should be exportable in standard formats (e.g. RTF) so that they can be shared and further modified in standard text editors. _(CC: This sounds straightforward enough, but care would be needed to limit any confusion or conflict with the existing text annotations layer which is designed for shorter and less free-form texts)_ _(MB: Then a solution would be to add a TextDocument layer to manage text documents)_. Playback control could be made easy with keyboard shortcuts (so that users can stay in the text editing pane while listening, rewinding, etc.). SV could also support a MIDI footswitch pedal for controlling playback. _(CC: That ought to be easy, SV already supports MIDI recording of notes as well as machine control of the transport via OSC; adding MMC as well should be straightforward)_
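
A possible data model for time-linked notes, with a very small RTF export, is sketched below in Python. The TimedNote class and the function names are hypothetical and do not correspond to any existing SV layer; the RTF produced is deliberately minimal (one paragraph per note, prefixed with its time position), and characters above U+7FFF are replaced by a question mark in this sketch.

```python
from dataclasses import dataclass


@dataclass
class TimedNote:
    time_s: float  # playback position the note refers to
    text: str


def format_timestamp(time_s):
    """Format seconds as MM:SS.t (tenths of a second)."""
    tenths = round(time_s * 10)
    minutes, rest = divmod(tenths, 600)
    return f"{minutes:02d}:{rest // 10:02d}.{rest % 10}"


def rtf_escape(text):
    """Escape text for RTF; characters beyond U+7FFF are replaced by '?'."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch in "\\{}":
            out.append("\\" + ch)
        elif ch == "\n":
            out.append("\\line ")
        elif code < 128:
            out.append(ch)
        elif code < 32768:
            out.append(f"\\u{code}?")
        else:
            out.append("?")
    return "".join(out)


def notes_to_rtf(notes):
    """Build a minimal RTF document with one paragraph per note, in time order."""
    body = "".join(
        f"[{format_timestamp(n.time_s)}] {rtf_escape(n.text)}\\par\n"
        for n in sorted(notes, key=lambda n: n.time_s)
    )
    return "{\\rtf1\\ansi\\deff0{\\fonttbl{\\f0 Helvetica;}}\\f0\n" + body + "}"


notes = [TimedNote(75.5, "vibrato widens"), TimedNote(3.0, "entry {solo}")]
document = notes_to_rtf(notes)
print(document)
```

Keeping the time position as a separate field, rather than only as text, is what would let the editor jump back to the corresponding point in the recording.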

*Enhancing speech and music recordings analysis:*

+Observations:+
Musicologists often use broadcast recordings that include both speech and music. They often have to transcribe interviews.

+Solutions:+
Content-based MIR techniques could be used to:
- facilitate navigation between speech and music sections (development of a Vamp plugin for automatic speech/music segmentation);
- transcribe interviews (development of a Vamp plugin for automatic speech recognition).

_(CC: No particular comments here except to note that one could easily also produce an excellent standalone transcription assistant program for a wider audience, using some of the text annotation and transport control logic referred to in the prior section)_
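
For the navigation part, the sketch below (Python) shows how frame-level speech/music labels could be turned into segments and used to jump to the next section boundary. It assumes a classifier (not shown, and hypothetical here) that outputs one label per analysis frame; it is not based on the Vamp API, and the function names are made up for the example.

```python
def labels_to_segments(frame_labels, hop_s):
    """Merge runs of identical frame labels into (start_s, end_s, label) tuples."""
    segments = []
    start = 0
    for i in range(1, len(frame_labels) + 1):
        # Close the current run at the end of the list or when the label changes.
        if i == len(frame_labels) or frame_labels[i] != frame_labels[start]:
            segments.append((start * hop_s, i * hop_s, frame_labels[start]))
            start = i
    return segments


def next_boundary(segments, position_s):
    """Return the start time of the first segment after position_s, or None."""
    for seg_start, _, _ in segments:
        if seg_start > position_s:
            return seg_start
    return None


labels = ["speech"] * 3 + ["music"] * 2 + ["speech"]
segments = labels_to_segments(labels, hop_s=0.5)
print(segments)
print(next_boundary(segments, 0.2))
```

In practice the raw labels would probably need smoothing first, since isolated misclassified frames would otherwise create many very short segments.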

*Sound editor functionality:*

+Observations:+
Before analysing music recordings, musicologists often have to digitise them (e.g. from LP) or rip them from CD. They also often need sound examples to illustrate talks or lectures.

+Solutions:+
SV could integrate some basic sound editing functionality like that offered by Audacity (e.g. CD ripping, cut and paste, amplitude envelope modification, recording from the audio input).

_(CC: My view has always been that this is an endlessly deep and murky pit and that it's wise to leave audio editing to audio editors -- although it happens by accident that you can do some basic editing already just by selecting regions and exporting them as new audio files)_
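
Outside SV, the "select a region and export it" operation can be approximated with a few lines of Python using the standard wave module, for instance to prepare sound examples for a talk. This is only a sketch for uncompressed PCM WAV files; the function name and arguments are illustrative and unrelated to SV's own export code.

```python
import wave


def export_region(src_path, dst_path, start_s, end_s):
    """Copy the region [start_s, end_s) of a PCM WAV file into a new WAV file."""
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        total = src.getnframes()
        # Clamp the requested region to the extent of the file.
        start_frame = min(max(0, int(start_s * rate)), total)
        end_frame = min(max(start_frame, int(end_s * rate)), total)
        src.setpos(start_frame)
        frames = src.readframes(end_frame - start_frame)
    with wave.open(dst_path, "wb") as dst:
        # Same channel count, sample width and rate as the source;
        # the frame count in the header is corrected when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)
    return end_frame - start_frame
```

The function returns the number of frames written, so a call such as export_region("in.wav", "excerpt.wav", 12.0, 20.0) on a 44.1 kHz file that is at least 20 seconds long would return 352800.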