annotate README @ 325:4cf4313d7e30 livemode

Always use q=0.8 and accept the hit on speed -- the templates are made for that configuration and it does work better. Also some adjustments to thresholding and peak picking for live mode in particular.
author Chris Cannam
date Mon, 18 May 2015 13:58:27 +0100
parents 8f48b65a6ef2
children a3fc6e1f2d4e
rev   line source
Chris@157 1
Chris@157 2
Chris@157 3 Silvet: Shift-Invariant Latent Variable Transcription
Chris@157 4 =====================================================
Chris@157 5
Chris@157 6 A polyphonic music transcription plugin.
Chris@157 7
Chris@157 8 http://code.soundsoftware.ac.uk/projects/silvet
Chris@157 9
Chris@157 10 Silvet is a Vamp plugin (http://vamp-plugins.org) for automatic music
Chris@157 11 transcription, using the method of "A Shift-Invariant Latent Variable
Chris@157 12 Model for Automatic Music Transcription" by Emmanouil Benetos and
Chris@157 13 Simon Dixon (see CITATION file).
Chris@157 14
Chris@157 15
Chris@157 16 What does it do?
Chris@157 17 ----------------
Chris@157 18
Chris@157 19 Silvet listens to audio recordings of music and tries to work out what
Chris@157 20 notes are being played.
Chris@157 21
Chris@157 22 To use Silvet, you need a Vamp plugin host such as Sonic Visualiser
Chris@157 23 (http://sonicvisualiser.org). How to use the plugin will depend on the
Chris@157 24 host, but in the case of Sonic Visualiser, you should load an audio
Chris@157 25 file and then run Silvet Note Transcription from the Transform
Chris@157 26 menu. This will add a note layer to your session with the
Chris@157 27 transcription in it, which you can play back or export as a MIDI file.
Chris@157 28
Chris@157 29
Chris@157 30 How good is it?
Chris@157 31 ---------------
Chris@157 32
Chris@212 33 It's reasonable for recordings that suit it: chamber music, solo
Chris@212 34 piano, acoustic jazz, etc. But the range of music that works well is
Chris@212 35 quite limited at this stage.
Chris@157 36
Chris@157 37 Silvet uses a probabilistic latent-variable estimation method to
Chris@157 38 decompose a Constant-Q time-frequency matrix into note activations
Chris@157 39 using a set of spectral templates learned from recordings of solo
Chris@157 40 instruments. This means its performance is dominated by the
Chris@157 41 correspondence between its instrument templates and the sounds present
Chris@157 42 in the recording.
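
To make the idea of decomposing a spectrum against fixed templates concrete, here is a minimal sketch in Python. It is not Silvet's actual model, which is shift-invariant and handles several instruments at once. It shows only the simplest related case: one spectral frame, one fixed template per pitch, and multiplicative updates that minimise a KL-type divergence. The function name, parameters and toy data are all assumptions made for illustration.

```python
def estimate_activations(frame, templates, iterations=50):
    """Estimate non-negative pitch activations for one spectral frame.

    frame:     list of F non-negative bin magnitudes (one constant-Q column)
    templates: list of P templates, each a list of F non-negative values
    Returns a list of P activations such that the weighted sum of the
    templates approximates the frame.

    Illustrative only: a fixed-template, non-shift-invariant simplification.
    """
    eps = 1e-12  # guards against division by zero
    n_bins = len(frame)
    n_pitches = len(templates)

    # Start from a uniform guess that carries the frame's total energy.
    activations = [sum(frame) / n_pitches] * n_pitches

    for _ in range(iterations):
        # Current reconstruction of the frame from the templates.
        model = [
            sum(activations[p] * templates[p][f] for p in range(n_pitches)) + eps
            for f in range(n_bins)
        ]
        # Per-bin ratio between what was observed and what the model predicts.
        ratio = [frame[f] / model[f] for f in range(n_bins)]
        # Multiplicative update: scale each activation by how well its
        # template explains the bins where the model falls short.
        activations = [
            activations[p]
            * sum(templates[p][f] * ratio[f] for f in range(n_bins))
            / (sum(templates[p]) + eps)
            for p in range(n_pitches)
        ]
    return activations


# Toy example: two hypothetical "pitches" over four frequency bins.
toy_templates = [
    [1.0, 0.0, 0.5, 0.0],
    [0.0, 1.0, 0.0, 0.5],
]
toy_frame = [1.0, 3.0, 0.5, 1.5]  # built as 1 x first template + 3 x second
toy_activations = estimate_activations(toy_frame, toy_templates)
```

When the sounds in the recording do not resemble any template, no combination of activations reconstructs the frame well, which is why the quality of the match between templates and recording matters so much.
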
Chris@157 43
Chris@157 44 The method performs quite well (70-85% of notes identified correctly)
Chris@157 45 for clear recordings that contain only instruments with a good
Chris@157 46 correspondence to the known templates. In these cases its performance
Chris@157 47 becomes limited by the note decomposition step, clustering pitch
Chris@157 48 probabilities into note events, which is still fairly simplistic.
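
As a rough picture of what clustering pitch probabilities into note events involves, the sketch below thresholds a per-frame activation matrix and groups consecutive active frames of the same pitch into notes. This is an assumed, deliberately naive scheme for illustration and is not the logic used in the plugin. The function name, the threshold and the minimum duration are hypothetical.

```python
def activations_to_notes(activations, threshold, min_frames=2):
    """Group thresholded pitch activations into note events.

    activations: list over time frames, each a list of per-pitch values
    threshold:   a pitch counts as sounding in a frame at or above this value
    min_frames:  runs shorter than this are discarded as spurious
    Returns a list of (pitch_index, start_frame, end_frame) tuples, with
    end_frame exclusive, sorted by start frame and then by pitch.

    Illustrative only: a naive threshold-and-group scheme.
    """
    notes = []
    n_frames = len(activations)
    n_pitches = len(activations[0]) if activations else 0

    for pitch in range(n_pitches):
        start = None  # frame at which the current run began, if any
        for t in range(n_frames):
            active = activations[t][pitch] >= threshold
            if active and start is None:
                start = t
            elif not active and start is not None:
                if t - start >= min_frames:
                    notes.append((pitch, start, t))
                start = None
        # Close a run that is still open at the end of the input.
        if start is not None and n_frames - start >= min_frames:
            notes.append((pitch, start, n_frames))

    return sorted(notes, key=lambda note: (note[1], note[0]))


# Toy example: five frames, two pitches.
toy_matrix = [
    [0.9, 0.0],
    [0.8, 0.1],
    [0.0, 0.7],
    [0.0, 0.9],
    [0.6, 0.8],
]
toy_notes = activations_to_notes(toy_matrix, threshold=0.5, min_frames=2)
```

A scheme this simple cannot tell a repeated note from a sustained one, and it is sensitive to the choice of threshold, which gives some sense of why this step limits accuracy.
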
Chris@157 49
Chris@157 50 Silvet does not yet contain any vocal templates, or templates for
Chris@157 51 typical rock or electronic instruments. So it will usually perform
Chris@157 52 very poorly with pop and rock music, although the results can be
Chris@157 53 interesting anyway. Silvet also makes no attempt to transcribe
Chris@157 54 percussion.
Chris@157 55
Chris@157 56 For a formal evaluation, please refer to the 2012 edition of MIREX,
Chris@157 57 the Music Information Retrieval Evaluation Exchange, where the basic
Chris@157 58 method implemented in Silvet formed the BD1, BD2 and BD3 submissions
Chris@157 59 in the Multiple F0 Tracking task:
Chris@157 60
Chris@157 61 http://www.music-ir.org/mirex/wiki/2012:Multiple_Fundamental_Frequency_Estimation_%26_Tracking_Results
Chris@157 62
Chris@157 63
Chris@212 64 Authors
Chris@212 65 -------
Chris@157 66
Chris@212 67 The Silvet plugin code was adapted by Chris Cannam from research and a
Chris@212 68 MATLAB implementation by Emmanouil Benetos.
Chris@157 69
Chris@212 70
Chris@212 71 Citation, License and Use
Chris@212 72 -------------------------
Chris@157 73
Chris@157 74 If you make use of this software for any public or commercial purpose,
Chris@157 75 we ask you to kindly mention the authors and Queen Mary, University of
Chris@157 76 London in your user-visible documentation. We're very happy to see
Chris@157 77 this sort of use but would much appreciate being credited, independent
Chris@212 78 of the requirements of the software license itself (see below).
Chris@157 79
Chris@212 80 If you make use of this software for academic purposes, please cite:
Chris@212 81
Chris@212 82 Emmanouil Benetos and Simon Dixon, "A Shift-Invariant Latent
Chris@212 83 Variable Model for Automatic Music Transcription".
Chris@212 84 Computer Music Journal, volume 36 no 4, 2012, pp. 81-94.
Chris@212 85
Chris@212 86 (See the CITATION file for a BibTeX reference.)
Chris@212 87
Chris@212 88 This plugin is Copyright 2014 Queen Mary, University of London. It is
Chris@212 89 distributed under the GNU General Public License: see the file COPYING
Chris@212 90 for details.