On Displaying Musical Scores

Use cases

  1. Illustrating recordings by attaching a full publication-style score
    • ...where the score is not intended to be read closely, but rather used as a sort of key-frame reference for navigating the audio
    • ...or where the user expects to be able to zoom the score far enough to actually read it properly
  2. Linear rather than page-based score that a user is expected to read, e.g. to play along with, or to check specific notes in a transcription or spectrogram
  3. On-the-fly score display of transcriptions and related annotation layers (score from MIDI)
  4. Score editing for correction (making usually small edits to substantial scores)
  5. Score editing for annotation (generating small scores from scratch)

Sources of score data

  • Score-encoding formats, e.g. MusicXML, MEI
  • PDFs of published scores
  • "Tidy" MIDI files and other cleaned-up annotation data (e.g. chord charts)
  • Untidy output from transcription methods and the like, in MIDI or MIDI-like formats

Possible implementations

  1. Integrate code from an existing application (most likely one that is also in C++ using Qt)
    • Suitable for all use cases, with the limitation that the score must be available in a digital format
    • Has the drawback of adding greatly to the complexity of the SV codebase, which is already quite complex
    • Only possibility if editing is required (but note that adding editing is likely to be very involved even with this approach, maybe prohibitively so)
    • MuseScore -- most obvious option
    • Rosegarden -- has a notation editor component but is primarily a sequencer (but does have the advantage that I wrote much of it and so know a lot of the code!)
  2. Invoke external application to render to PDF or sequence of images, then display PDF/image pages
    • Suitable for use case 1; unsuitable for use cases 2, 4, 5 (no way to do linear notation); uncertain for use case 3
    • Replaces problem of "integrating score display" with problem of "integrating PDF/image page display"
    • Could possibly draw from existing image layer display in SV
    • Could use MuseScore as the external application (it has a batch mode), or LilyPond, or something else like VexFlow, or more than one option depending on platform and input format
    • Has advantage of also introducing ability to display other PDF material than scores
    • Adds access to non-digital (scanned) score content
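
As a rough illustration of option 2, here is a Python sketch that builds and runs a batch export command. The executable name `mscore` is an assumption (it differs between platforms and MuseScore versions), and the function names and output directory are hypothetical; `-o` is MuseScore's export-to-file option, which chooses the output format from the file extension.

```python
import subprocess
from pathlib import Path

def render_command(score_path, out_dir, renderer="mscore"):
    """Build a batch-mode command that exports a score to PDF.

    `renderer` is an assumed executable name; adjust it for the platform.
    Returns the argument list and the expected output path.
    """
    score = Path(score_path)
    out = Path(out_dir) / (score.stem + ".pdf")
    return [renderer, "-o", str(out), str(score)], out

def render_to_pdf(score_path, out_dir, renderer="mscore"):
    """Run the renderer and return the PDF path; raises if the command fails."""
    cmd, out = render_command(score_path, out_dir, renderer)
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(cmd, check=True)
    return out
```

The resulting PDF (or a set of page images exported the same way) would then be handed to whatever page display is built on the SV side.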

Both have their technical challenges, but there is also a conceptual question about how to align score pages or bars in the time axis (more on this below).

Which code to use for option 1?

I think the only reasonable options are MuseScore and Rosegarden. (LilyPond might be a faint possibility, but it's written in a combination of languages and doesn't natively render to Qt.)

MuseScore produces much better output than Rosegarden and has fairly well-organised code. Rosegarden has a more limited renderer and currently lacks MusicXML import. The only reason to consider anything other than MuseScore is that I'm quite familiar with the Rosegarden code myself already (and, I suppose, that it might be possible to contribute improvements to Rosegarden as well). At any rate, Rosegarden makes a useful baseline comparison.

Here are some examples, deliberately choosing small sizes in order to see how a zoomed-out scale looks, and considering both linear and paginated layouts.

Linear layouts with "good" source material

Both applications are displaying their own example files here, so we have two different pieces but both are pieces that are expected to work well.

[Image: musescore-tidy-linear.png - MuseScore displays one of its demo files in linear format]
[Image: rosegarden-tidy-linear.png - Rosegarden displays one of its demo files in linear format]



Paginated layouts with "good" source material

Same pieces as above.

[Image: musescore-tidy-paginated.png - MuseScore displays one of its demo files in paginated format]
[Image: rosegarden-tidy-paginated.png - Rosegarden displays one of its demo files in paginated format]



Imported from an approximate transcription

This is an input that is not expected to work well -- a transcription obtained from the Silvet Vamp plugin of a MAPS database recording of Chopin mazurka op 7 no 1, exported to MIDI. Shown below is the result of taking that MIDI file as exported by Sonic Visualiser and importing it into each application with the default quantization/tidying settings.

Despite their substantially different appearances, these are both displaying the same MIDI data. Besides quantization differences, they use different numbers of staffs and different clefs, keys, and time signatures. None of those are given in the MIDI file here and both programs contain logic to guess them. MuseScore has guessed the wrong key (according to the original score) but the right time signature. Rosegarden has the right key but the wrong time signature.

[Image: musescore-messy.png - MuseScore imports a very rough MIDI transcription]
[Image: rosegarden-messy.png - Rosegarden imports a very rough MIDI transcription]



How much code is involved?

MuseScore's libmscore appears to contain the rendering code (not sure about file I/O). That is 136 .cpp files plus headers, 128K lines total including headers.

Rosegarden doesn't have a dedicated library structure. Its notation code is divided between the base and gui/editors/notation directories, with MIDI file I/O elsewhere (a small fraction of the code in a sound/ directory). base is 53 .cpp files and gui/editors/notation is 42; combined they have 76K lines including headers.

Aligning score pages and bars in the time axis

There's a conceptual problem with how to lay out a score along the x-axis (time) when aligning it with audio or audio feature material in Sonic Visualiser.

First you need a mapping between score position (bar number etc) and audio time. If the score is a representation of a SV note layer (MIDI output from a note transcription perhaps) then this might be intrinsic to the layer. If the audio is on the same timeline as the score (e.g. is an audio rendering from MIDI) then the mapping could be trivial. But more usually an imported score will have no obvious mapping. It will need to be set up manually, by the user e.g. dragging bits of the score around, or automatically, by performing audio-score alignment (or both, with the user correcting an automatic alignment). Scores displayed page-by-page presumably can cope with a coarser mapping than linear ones.
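
One simple way to represent such a mapping is as a sorted list of anchor points, interpolating linearly between them. The Python sketch below assumes score positions are expressed as fractional bar numbers and that the anchors come from manual placement or an alignment method; the function name and data layout are hypothetical, not anything SV currently has.

```python
from bisect import bisect_right

def score_to_audio_time(anchors, bar):
    """Map a score position (fractional bar number) to audio time in seconds.

    anchors: at least two (bar, seconds) pairs with strictly increasing bar
    values. Positions outside the anchored range are extrapolated from the
    nearest segment.
    """
    if len(anchors) < 2:
        raise ValueError("need at least two anchors")
    bars = [b for b, _ in anchors]
    # Index of the anchor at or before `bar`, clamped so that a right-hand
    # neighbour always exists.
    i = bisect_right(bars, bar) - 1
    i = max(0, min(i, len(anchors) - 2))
    (b0, t0), (b1, t1) = anchors[i], anchors[i + 1]
    return t0 + (bar - b0) * (t1 - t0) / (b1 - b0)

# Example anchors: bar 1 sounds at 0.5 s, bar 5 at 6.5 s, bar 9 at 14.5 s.
anchors = [(1.0, 0.5), (5.0, 6.5), (9.0, 14.5)]
```

A page-by-page display would only need one anchor per page, whereas a linear display would probably want at least one per bar.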

Given a mapping between score position and audio time, it's necessary to position the score elements in the right places.

With a page-by-page score this should be just a question of dropping in page images, except that they may need to be scaled or moved to avoid overlap (since SV usually zooms the time axis independently of the vertical axis). The existing SV image layer does some of that scaling itself, not always successfully.
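
To make the overlap problem concrete, here is a Python sketch of one possible policy: each page gets the horizontal slot between its own start time and the next page's, and is shrunk uniformly if its full-height image would not fit. This is an assumed policy for illustration, not a description of what the existing image layer does, and all names are hypothetical.

```python
def layout_pages(page_times, end_time, aspect, pixels_per_second, layer_height):
    """Place page images along the time axis without overlap.

    page_times: audio time (seconds) at which each page begins, ascending.
    end_time: audio time at which the last page ends.
    aspect: page width divided by page height.
    Returns a list of (x, width, height) tuples in pixels.
    """
    bounds = list(page_times) + [end_time]
    natural_width = layer_height * aspect  # width when drawn at full height
    placed = []
    for start, stop in zip(bounds, bounds[1:]):
        slot = (stop - start) * pixels_per_second
        width = min(natural_width, slot)   # shrink only when the slot is too narrow
        height = width / aspect            # keep the page's proportions
        placed.append((start * pixels_per_second, width, height))
    return placed
```

At low time-axis zoom the slots become narrow and the pages shrink towards unreadable thumbnails, which is acceptable for the key-frame style of use but not for close reading.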

With a linear score, the job presumably is to ensure bar lines appear at the right places and stretch or squash bars to ensure that. Any renderer that can do page rendering should be able to handle stretching and squashing individual bars (it's necessary when justifying a system), though there's likely to be a limit on how much a bar can be squished. There may be some room for adapting how the score zoom responds to time-axis zoom (scaling the score in both axes vs changing only the horizontal spacing).
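
The bar-fitting calculation might look something like the following Python sketch. It assumes the renderer can report a "natural" width for each bar and accept a target width, and it treats the squashing limit as a simple minimum stretch factor; the function name, the dictionary layout and the default limit of 0.5 are all hypothetical.

```python
def fit_bars(barline_times, natural_widths, pixels_per_second, min_stretch=0.5):
    """Compute the on-screen extent of each bar so bar lines fall at audio times.

    barline_times: audio times (seconds) of successive bar lines; one more
    entry than there are bars.
    natural_widths: width in pixels the renderer would choose for each bar.
    Returns one dict per bar with its x position, target width, stretch
    factor, and whether the stretch stays above the assumed limit.
    """
    if len(barline_times) != len(natural_widths) + 1:
        raise ValueError("need one more bar line than bars")
    fitted = []
    spans = zip(barline_times, barline_times[1:])
    for (start, stop), natural in zip(spans, natural_widths):
        target = (stop - start) * pixels_per_second
        stretch = target / natural
        fitted.append({
            "x": start * pixels_per_second,
            "width": target,
            "stretch": stretch,
            "legible": stretch >= min_stretch,
        })
    return fitted
```

When bars fall below the limit, the display could either scale the whole score down in both axes or fall back to showing bar lines only; that choice is the zoom-response question raised above.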

I haven't had any experience with systems that try to display score and audio in the same timeline, but presumably there are some examples that could be learned from.

Attached images (all added by Chris Cannam, 2016-01-07)

  • musescore-tidy-linear.png - MuseScore displays one of its demo files in linear format
  • rosegarden-tidy-linear.png - Rosegarden displays one of its demo files in linear format
  • musescore-tidy-paginated.png - MuseScore displays one of its demo files in paginated format
  • rosegarden-tidy-paginated.png - Rosegarden displays one of its demo files in paginated format
  • musescore-messy.png - MuseScore imports a very rough MIDI transcription
  • rosegarden-messy.png - Rosegarden imports a very rough MIDI transcription