Possibilities for Plugin Parameters

Two directions we could take:

  1. No parameters
  2. Enough parameters to be interesting

Fast and Slow modes

A problem with "no parameters" is that processing speed differs greatly between configurations. There is a strong case for offering at least a choice between a fast/draft mode and a slow/thorough mode.

  • The most obvious difference would be that "fast mode" should suppress the 5-step shift factor.
  • We might also consider using a finer-grained time step in "slow mode". I think the current 40 ms step results in audible jitter, though I may be wrong: any timing imprecision may come mostly from some other aspect of the method.
  • We could do more EM iterations in slow mode.
  • Our CQ transform goes down to 27.5 Hz, but the lowest note we ever return is over 60 Hz. Fast mode could possibly skip the bottom CQ octave (27.5 to 55 Hz) and just fill it with zeros.
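
As a rough sketch of how the ideas in this list could hang together, the two modes might simply be two bundles of settings. Everything named here (ModeSettings, its fields, zero_bottom_octave) is hypothetical, and apart from the 40 ms step every number is a placeholder rather than a measured or recommended value. The helper assumes a CQ column is stored lowest bin first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeSettings:
    use_shifts: bool          # apply the 5-step shift factor?
    step_ms: float            # time step between processed columns
    em_iterations: int        # EM iterations per column
    drop_bottom_octave: bool  # zero-fill the lowest CQ octave?


# Placeholder values: only the 40 ms step comes from the notes above.
FAST = ModeSettings(use_shifts=False, step_ms=40.0,
                    em_iterations=6, drop_bottom_octave=True)
SLOW = ModeSettings(use_shifts=True, step_ms=20.0,
                    em_iterations=12, drop_bottom_octave=False)


def zero_bottom_octave(column, bins_per_octave):
    """Return a copy of a CQ column with its lowest octave set to zero.

    Assumes the column is ordered from the lowest bin upwards.
    """
    n = min(bins_per_octave, len(column))
    return [0.0] * n + list(column[n:])
```

The octave starting at 27.5 Hz ends at 55 Hz, which is below the 60 Hz floor on returned notes, so zeroing it should not remove any fundamental we would report. Whether it affects the fit in other ways would need testing.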

Note threshold

Instrument restrictions

Presumably the method can run much faster if we are able to tell it that a piece has only one instrument in it. We might offer a dropdown of "all known instruments", "piano", "trombone" etc.

Apart from speeding up the plugin by reducing the number of templates to consider, this could also give better results. With a single instrument we could afford to add other templates (e.g. multiple versions of that instrument -- if you only have piano, it could use all 3 piano templates, whereas for multiple instruments you may not want to spend the time), or we could adjust other parameters (e.g. sparsity constraints for monophonic instruments).
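
To make the template trade-off concrete, here is a minimal sketch of how a dropdown choice might map to a template set. The inventory and the select_templates function are hypothetical: only the "3 piano templates" figure comes from these notes, and the other instruments and counts are invented for illustration.

```python
# Hypothetical inventory: instrument name -> number of templates available.
# Only the piano count is taken from the notes; the rest is made up.
TEMPLATES = {"piano": 3, "trombone": 1, "violin": 1}


def select_templates(instrument="all"):
    """Return the (instrument, template_index) pairs to decompose against."""
    if instrument == "all":
        # Many instruments: one template each, to keep run time down.
        return [(name, 0) for name in TEMPLATES]
    if instrument not in TEMPLATES:
        raise ValueError("unknown instrument: " + instrument)
    # Single instrument: we can afford every version of its template.
    return [(instrument, i) for i in range(TEMPLATES[instrument])]
```

With this inventory, choosing "piano" yields three templates and choosing "all" also yields three, one per instrument, so the single-instrument case buys extra template variety at no extra cost rather than raw speed. The real balance depends on how many instruments and versions we actually ship.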

We could separately have options for instruments to detect and instruments to return -- e.g. detect all instruments but return only the piano, or detect only the piano and return that, etc.

That sounds confusing. But the confusion may be intrinsic to the meaning of "which instrument I detect" -- if you tell the plugin to detect only piano, that doesn't mean it will separate out the piano from other instruments; you have to get it to detect all the other instruments too, if you want to be able to separate out the piano. Detecting only the piano basically means you're telling it there are no other instruments, as a pure performance optimisation.
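
One way to picture the distinction: "instruments to return" is just a filter applied after detection, whereas "instruments to detect" changes what the method is allowed to explain the audio with. The sketch below covers only the filtering half. The Note type, the filter_returned function and the sample data are all hypothetical, not the plugin's actual structures.

```python
from typing import NamedTuple


class Note(NamedTuple):
    midi_pitch: int
    instrument: str  # instrument the method estimated for this note


def filter_returned(notes, return_instruments):
    """Keep only the notes whose estimated instrument was asked for."""
    return [n for n in notes if n.instrument in return_instruments]


# Made-up result of a run with all instrument templates enabled.
detected_all = [Note(60, "piano"), Note(55, "trombone"), Note(64, "piano")]
piano_only = filter_returned(detected_all, {"piano"})
```

If detection itself were restricted to piano, every note would come back labelled as piano, so the same filter would remove nothing. That is the sense in which "detect only piano" is an optimisation and not a separation.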

It may be possible to phrase this so as to resolve some of the confusion. But it's also possible that the instrument identification isn't reliable enough to base a feature like this on anyway? I don't know, I haven't tested...

...and Possibilities for Plugin Outputs

Currently we have one main output

  • Note transcription, including pitch in Hz and velocity

and three "intermediate data" outputs

  • Raw constant-Q
  • Filtered constant-Q
  • Pitch activation matrix

It may be worth noting that, because SV displays the outputs in alphabetical order, Note Transcription is not the first output listed (filtered CQ is). That's bad, and suggests the filtered CQ output should be renamed or removed.
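
The ordering problem is easy to reproduce with the output names as listed above. This is only an illustration: it assumes the host does a plain case-insensitive sort on those names, and the plugin's actual output names may differ.

```python
# Output names as written in these notes (not necessarily the real ones).
outputs = [
    "Note transcription",
    "Raw constant-Q",
    "Filtered constant-Q",
    "Pitch activation matrix",
]

# Assumption: the host sorts names alphabetically, ignoring case.
display_order = sorted(outputs, key=str.lower)

# One of the two suggested fixes: drop the filtered CQ output entirely.
without_filtered = sorted(
    (name for name in outputs if name != "Filtered constant-Q"),
    key=str.lower,
)
```

Here display_order starts with "Filtered constant-Q", while without_filtered starts with "Note transcription". A rename would work just as well, provided the new name sorts after "Note transcription".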

What else does the plugin know, that might be interesting?

  • Identity of predominant instrument, or of the estimated instrument for each note (not currently returned through the note transcription)
  • Approximate tuning, from the 5-step shift factor -- we currently return pitch in Hz, but it's calculated only from the MIDI note. Is there any merit in optionally returning a theoretically finer tuning?
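
For the tuning question, a finer pitch could in principle be derived by folding the shift into the usual MIDI-to-Hz formula. The sketch below rests on an assumption not stated in these notes: that the 5 shift steps are indexed -2..+2 and that each step is one fifth of a semitone. The function name and parameters are hypothetical.

```python
def midi_to_hz(midi_note, shift=0, shifts_per_semitone=5, concert_a=440.0):
    """Convert a MIDI note number, plus an optional shift step, to Hz.

    Assumes each shift step is 1/shifts_per_semitone of a semitone,
    with shift=0 meaning exactly in tune (equal temperament, A4 = concert_a).
    """
    semitones_from_a4 = midi_note - 69 + shift / shifts_per_semitone
    return concert_a * 2.0 ** (semitones_from_a4 / 12.0)
```

With shift left at 0 this reproduces what we return today, e.g. midi_to_hz(69) gives 440.0. A nonzero shift moves the result by up to two fifths of a semitone either way, which is only meaningful if the shift estimate is stable for a given note.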