<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <link rel="stylesheet" media="screen" type="text/css" href="/screen.css"/>
    <link rel="icon" type="image/png" href="/images/waveform.png"/>
    <link rel="shortcut" type="image/png" href="/images/waveform.png"/>
    <title>QM Vamp Plugins: User Documentation</title>
    <meta name="robots" content="index"/>
  </head>
  <body>
<h1 id="header"><span>Vamp Plugins</span></h1>

<h2>QM Vamp Plugins</h2>

<p>The QM Vamp Plugin set is a library of Vamp audio feature
extraction plugins developed at the <a
href="http://www.elec.qmul.ac.uk/digitalmusic/">Centre for Digital
Music</a> at Queen Mary, University of London.  These plugins are
provided as a single library file, made available in binary form for
Windows, OS X, and Linux from the Centre for Digital Music's <a
href="http://www.elec.qmul.ac.uk/digitalmusic/downloads/index.html#qm-vamp-plugins">download
page</a>.
</p>
<p>For more information about Vamp plugins, see <a href="http://www.vamp-plugins.org/">http://www.vamp-plugins.org/</a>.
</p>

<div class="toc2">1. &nbsp;<a href="#qm-onsetdetector">Note Onset Detector</a></div>
<div class="toc2">2. &nbsp;<a href="#qm-tempotracker">Tempo and Beat Tracker</a></div>
<div class="toc2">3. &nbsp;<a href="#qm-keydetector">Key Detector</a></div>
<div class="toc2">4. &nbsp;<a href="#qm-tonalchange">Tonal Change</a></div>
<div class="toc2">5. &nbsp;<a href="#qm-segmenter">Segmenter</a></div>
<div class="toc2">6. &nbsp;<a href="#qm-similarity">Similarity</a></div>
<div class="toc2">7. &nbsp;<a href="#qm-constantq">Constant-Q Spectrogram</a></div>
<div class="toc2">8. &nbsp;<a href="#qm-chromagram">Chromagram</a></div>
<div class="toc2">9. &nbsp;<a href="#qm-mfcc">Mel-Frequency Cepstral Coefficients</a></div>

<a name="qm-onsetdetector"></a><a name="qm-"></a><h2>1. Note Onset Detector</h2>

<p><b>System identifier</b> &ndash;    <code>vamp:qm-vamp-plugins:qm-onsetdetector</code>
<br><b>RDF URI</b> &ndash; <a href="http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-onsetdetector">http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-onsetdetector</a>
<br><b>Links</b> &ndash; <a href="#">Back to top of library documentation</a> &ndash; <a href="http://www.elec.qmul.ac.uk/digitalmusic/downloads/index.html#qm-vamp-plugins">Download location</a>
</p>
<p>Note Onset Detector analyses a single channel of audio and estimates
 the onset times of notes within the music &ndash; that is, the times at
 which notes and other audible events begin.
</p>
<p>It calculates an onset likelihood function for each spectral frame,
 and picks peaks in a smoothed version of this function.  The plugin is
 non-causal, returning all results at the end of processing.
</p>
<h3>Parameters</h3>

<p><b>Onset Detection Function Type</b> &ndash; The method used to calculate the
 onset likelihood function.  The most versatile method is the default,
 "Complex Domain" (see reference, Duxbury et al 2003).  "Spectral
 Difference" may be appropriate for percussive recordings, "Phase
 Deviation" for non-percussive music, and "Broadband Energy Rise" (see
 reference, Barry et al 2005) for identifying percussive onsets in
 mixed music.
</p>
<p><b>Onset Detector Sensitivity</b> &ndash; Sensitivity level for peak detection
 in the onset likelihood function.  The higher the sensitivity, the
 more onsets will (rightly or wrongly) be detected.  The peak picker
 does not have a simple threshold level; instead, this parameter
 controls the required "steepness" of the slopes in the smoothed
 detection function either side of a peak value, in order for that peak
 to be accepted as an onset.
</p>
<p><b>Adaptive Whitening</b> &ndash; This option evens out the temporal and
 frequency variation in the signal, which can yield improved
 performance in onset detection, for example in audio with big
 variations in dynamics.
</p>
<h3>Outputs</h3>

<p><b>Note Onsets</b> &ndash; The detected note onset times, returned as a single
 feature with timestamp but no value for each detected note.
</p>
<p><b>Onset Detection Function</b> &ndash; The raw note onset likelihood function
 that was calculated as the first step of the detection process.
</p>
<p><b>Smoothed Detection Function</b> &ndash; The note onset likelihood function
 following median filtering.  This is the function from which
 sufficiently steep peak values are picked and classified as onsets.
</p>
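<p>As an illustration, the following sketch shows how the onsets might be
 obtained from a general-purpose host such as the Python <code>vamp</code>
 host module, assuming that module and the plugin library are installed.
 The plugin key is the system identifier above without its "vamp:" prefix;
 the parameter identifier used below is an assumption, not taken from this
 documentation, and should be checked against the plugin's parameter
 descriptors in your host.
</p>
<pre>
import vamp                      # Python Vamp plugin host module
import soundfile as sf           # any loader yielding float samples will do

audio, rate = sf.read("music.wav")
if audio.ndim > 1:
    audio = audio.mean(axis=1)   # the plugin analyses a single channel

# "sensitivity" as a parameter identifier is an assumption; adjust it to
# match the plugin's actual parameter descriptors.
result = vamp.collect(audio, rate, "qm-vamp-plugins:qm-onsetdetector",
                      parameters={"sensitivity": 50})

# Sparse outputs such as onset times are typically returned under "list",
# one feature per detected onset, each with a timestamp and no value.
for feature in result["list"]:
    print(feature["timestamp"])
</pre>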
<h3>References and Credits</h3>

<p><b>Basic detection methods</b>: C. Duxbury, J. P. Bello, M. Davies and
 M. Sandler, <i><a href="http://www.elec.qmul.ac.uk/dafx03/proceedings/pdfs/dafx81.pdf">Complex domain Onset Detection for Musical Signals</a></i>. In
 Proceedings of the 6th Conference on Digital Audio Effects
 (DAFx-03). London, UK. September 2003.
</p>
<p><b>Adaptive whitening</b>: D. Stowell and M. D. Plumbley, <i><a href="http://www.elec.qmul.ac.uk/digitalmusic/papers/2007/StowellPlumbley07-icmc.pdf">Adaptive whitening for improved real-time audio onset detection</a></i>. In
 Proceedings of the International Computer Music Conference (ICMC'07),
 August 2007.
</p>
<p><b>Percussion onset detector</b>: D. Barry, D. Fitzgerald, E. Coyle and
 B. Lawlor, <i><a href="http://eleceng.dit.ie/papers/15.pdf">Drum Source Separation using Percussive Feature Detection and Spectral Modulation</a></i>. ISSC 2005.
</p>
<p>The Note Onset Detector Vamp plugin was written by Chris Duxbury, Juan
 Pablo Bello and Christian Landone.
</p>
<a name="qm-tempotracker"></a><h2>2. Tempo and Beat Tracker</h2>

<p><b>System identifier</b> &ndash;    <code>vamp:qm-vamp-plugins:qm-tempotracker</code>
<br><b>RDF URI</b> &ndash; <a href="http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-tempotracker">http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-tempotracker</a>
<br><b>Links</b> &ndash; <a href="#">Back to top of library documentation</a> &ndash; <a href="http://www.elec.qmul.ac.uk/digitalmusic/downloads/index.html#qm-vamp-plugins">Download location</a>
</p>
<p>Tempo and Beat Tracker analyses a single channel of audio and
 estimates the positions of metrical beats within the music (the
 equivalent of a human listener tapping their foot to the beat).
</p>
<h3>Parameters</h3>

<p><b>Onset Detection Function Type</b> &ndash; The method used to calculate the
 onset likelihood function.  The most versatile method is the default,
 "Complex Domain" (see reference, Duxbury et al 2003).  "Spectral
 Difference" may be appropriate for percussive recordings, "Phase
 Deviation" for non-percussive music, and "Broadband Energy Rise" (see
 reference, Barry et al 2005) for identifying percussive onsets in
 mixed music.
</p>
<p><b>Adaptive Whitening</b> &ndash; This option evens out the temporal and
 frequency variation in the signal, which can yield improved
 performance in onset detection, for example in audio with big
 variations in dynamics.
</p>
<h3>Outputs</h3>

<p><b>Beats</b> &ndash; The estimated beat locations, returned as a single feature,
 with timestamp but no value, for each beat, labelled with the
 corresponding estimated tempo at that beat.
</p>
<p><b>Onset Detection Function</b> &ndash; The raw note onset likelihood function
 used in beat estimation.
</p>
<p><b>Tempo</b> &ndash; The estimated tempo, returned as a feature each time the
 estimated tempo changes, with a single value for the tempo in beats
 per minute.
</p>
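<p>A minimal sketch of reading the Beats output through the Python
 <code>vamp</code> host module follows, assuming that module and the plugin
 library are installed; the output identifier "beats" is an assumption and
 should be checked against the plugin's output descriptors.
</p>
<pre>
import vamp
import soundfile as sf

audio, rate = sf.read("music.wav")
if audio.ndim > 1:
    audio = audio.mean(axis=1)

# "beats" as the output identifier is an assumption; omitting it would
# select the plugin's first (default) output instead.
result = vamp.collect(audio, rate, "qm-vamp-plugins:qm-tempotracker", "beats")

for feature in result["list"]:
    # Each beat is a timestamped feature whose label carries the estimated tempo.
    print(feature["timestamp"], feature.get("label", ""))
</pre>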
<h3>References and Credits</h3>

<p><b>Beat tracking method</b>: M. E. P. Davies and M. D. Plumbley.
 <i><a href="http://www.elec.qmul.ac.uk/people/markp/2007/DaviesPlumbley07-taslp.pdf">Context-dependent beat tracking of musical audio</a></i>. In IEEE
 Transactions on Audio, Speech and Language Processing. Vol. 15, No. 3,
 pp. 1009-1020, 2007.  See also M. E. P. Davies and M. D. Plumbley.
 <i><a href="http://www.elec.qmul.ac.uk/people/markp/2005/DaviesPlumbley05-icassp.pdf">Beat Tracking With A Two State Model</a></i>. In Proceedings of the IEEE
 International Conference on Acoustics, Speech and Signal Processing
 (ICASSP 2005), Vol. 3, pp. 241-244, Philadelphia, USA, March 19-23, 2005.
</p>
<p><b>Onset detection methods</b>: C. Duxbury, J. P. Bello, M. Davies and
 M. Sandler, <i><a href="http://www.elec.qmul.ac.uk/dafx03/proceedings/pdfs/dafx81.pdf">Complex domain Onset Detection for Musical Signals</a></i>. In
 Proceedings of the 6th Conference on Digital Audio Effects
 (DAFx-03). London, UK. September 2003.
</p>
<p><b>Adaptive whitening</b>: D. Stowell and M. D. Plumbley, <i><a href="http://www.elec.qmul.ac.uk/digitalmusic/papers/2007/StowellPlumbley07-icmc.pdf">Adaptive whitening for improved real-time audio onset detection</a></i>. In
 Proceedings of the International Computer Music Conference (ICMC'07),
 August 2007.
</p>
<p><b>Percussion onset detector</b>: D. Barry, D. Fitzgerald, E. Coyle and
 B. Lawlor, <i><a href="http://eleceng.dit.ie/papers/15.pdf">Drum Source Separation using Percussive Feature Detection and Spectral Modulation</a></i>. ISSC 2005.
</p>
<p>The Tempo and Beat Tracker Vamp plugin was written by Matthew Davies
 and Christian Landone.
</p>
<a name="qm-keydetector"></a><h2>3. Key Detector</h2>

<p><b>System identifier</b> &ndash;    <code>vamp:qm-vamp-plugins:qm-keydetector</code>
<br><b>RDF URI</b> &ndash; <a href="http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-keydetector">http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-keydetector</a>
<br><b>Links</b> &ndash; <a href="#">Back to top of library documentation</a> &ndash; <a href="http://www.elec.qmul.ac.uk/digitalmusic/downloads/index.html#qm-vamp-plugins">Download location</a>
</p>
<p>Key Detector analyses a single channel of audio and continuously
 estimates the key of the music by comparing the degree to which a
 block-by-block chromagram correlates to the stored key profiles for
 each major and minor key.
</p>
<p>The key profiles are drawn from analysis of Book I of the
 Well-Tempered Clavier by J. S. Bach, recorded at A=440 in equal temperament.
</p>
<h3>Parameters</h3>

<p><b>Tuning Frequency</b> &ndash; The frequency of concert A in the music under
 analysis.
</p>
<p><b>Window Length</b> &ndash; The number of chroma analysis frames taken into
 account for key estimation.  This controls how eager the key detector
 will be to return short-duration tonal changes as new key changes (the
 shorter the window, the more likely it is to detect a new key change).
</p>
<h3>Outputs</h3>

<p><b>Tonic Pitch</b> &ndash; The tonic pitch of each estimated key change,
 returned as a single-valued feature at the point where the key change
 is detected, with value counted from 1 to 12 where C is 1, C# or Db is
 2, and so on up to B which is 12.
</p>
<p><b>Key Mode</b> &ndash; The major or minor mode of the estimated key, where
 major is 0 and minor is 1.
</p>
<p><b>Key</b> &ndash; The estimated key for each key change, returned as a
 single-valued feature at the point where the key change is detected,
 with value counted from 1 to 24 where 1-12 are the major keys and
 13-24 are the minor keys, such that C major is 1, C# major is 2, and
 so on up to B major which is 12; then C minor is 13, Db minor is 14,
 and so on up to B minor which is 24.
</p>
<p><b>Key Strength Plot</b> &ndash; A grid representing the ongoing key
 "probability" throughout the music.  This is returned as a feature for
 each chroma frame, containing 25 bins.  Bins 1-12 are the major keys
 from C upwards; bins 14-25 are the minor keys from C upwards.  The
 13th bin is unused: it just provides space between the first and
 second halves of the feature if displayed in a single plot.
</p>
<p>The outputs are also labelled with pitch or key as text.
</p>
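<p>The numbering described above can be turned into readable names with a
 small helper; this sketch simply follows the numbering given in this
 documentation.
</p>
<pre>
NOTE_NAMES = ["C", "C#/Db", "D", "D#/Eb", "E", "F",
              "F#/Gb", "G", "G#/Ab", "A", "A#/Bb", "B"]

def key_name(key_index):
    """Map the Key output value (1-24) to a name: 1-12 major, 13-24 minor."""
    if key_index in range(1, 13):
        return NOTE_NAMES[key_index - 1] + " major"
    if key_index in range(13, 25):
        return NOTE_NAMES[key_index - 13] + " minor"
    raise ValueError("key index out of range")

print(key_name(1))    # C major
print(key_name(14))   # C#/Db minor
</pre>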
<h3>References and Credits</h3>

<p><b>Method</b>: see K. Noland and M. Sandler. <i><a href="http://www.aes.org/e-lib/browse.cfm?elib=14140">Signal Processing Parameters for Tonality Estimation</a></i>. In Proceedings of Audio Engineering Society
 122nd Convention, Vienna, 2007.
</p>
<p>The Key Detector Vamp plugin was written by Katy Noland and Christian
 Landone.
</p>
<a name="qm-tonalchange"></a><h2>4. Tonal Change</h2>

<p><b>System identifier</b> &ndash;    <code>vamp:qm-vamp-plugins:qm-tonalchange</code>
<br><b>RDF URI</b> &ndash; <a href="http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-tonalchange">http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-tonalchange</a>
<br><b>Links</b> &ndash; <a href="#">Back to top of library documentation</a> &ndash; <a href="http://www.elec.qmul.ac.uk/digitalmusic/downloads/index.html#qm-vamp-plugins">Download location</a>
</p>
<p>Tonal Change analyses a single channel of audio, detecting harmonic
 changes such as chord boundaries.
</p>
<h3>Parameters</h3>

<p><b>Gaussian smoothing</b> &ndash; The window length for the internal smoothing
 operation, in chroma analysis frames.  This controls how eager the
 tonal change detector will be to identify very short-term tonal
 changes.  The default value of 5 is quite short, and may lead to more
 (not always meaningful) results being returned; for many purposes a
 larger value, closer to the maximum of 20, may be appropriate.
</p>
<p><b>Chromagram minimum pitch</b> &ndash; The MIDI pitch value (0-127) of the
 minimum pitch included in the internal chromagram analysis.
</p>
<p><b>Chromagram maximum pitch</b> &ndash; The MIDI pitch value (0-127) of the
 maximum pitch included in the internal chromagram analysis.
</p>
<p><b>Chromagram tuning frequency</b> &ndash; The frequency of concert A in the
 music under analysis.
</p>
<h3>Outputs</h3>

<p><b>Transform to 6D Tonal Content Space</b> &ndash; A representation of the
 musical content in a six-dimensional tonal space onto which the
 algorithm maps 12-bin chroma vectors extracted from the audio.
</p>
<p><b>Tonal Change Detection Function</b> &ndash; A function representing the
 estimated likelihood of a tonal change occurring in each spectral
 frame.
</p>
<p><b>Tonal Change Positions</b> &ndash; The resulting estimated positions of tonal
 changes.
</p>
<h3>References and Credits</h3>

<p><b>Method</b>: C. A. Harte, M. Gasser, and M. Sandler. <i><a href="http://portal.acm.org/citation.cfm?id=1178723.1178727">Detecting harmonic change in musical audio</a></i>.  In Proceedings of the 1st ACM workshop on
 Audio and Music Computing Multimedia, Santa Barbara, 2006.
</p>
<p>The Tonal Change Vamp plugin was written by Chris Harte and Martin
 Gasser.
</p>
<a name="qm-segmenter"></a><h2>5. Segmenter</h2>

<p><b>System identifier</b> &ndash; <code>vamp:qm-vamp-plugins:qm-segmenter</code>
<br><b>RDF URI</b> &ndash; <a href="http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-segmenter">http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-segmenter</a>
<br><b>Links</b> &ndash; <a href="#">Back to top of library documentation</a> &ndash; <a href="http://www.elec.qmul.ac.uk/digitalmusic/downloads/index.html#qm-vamp-plugins">Download location</a>
</p>
<p>Segmenter divides a single channel of music up into structurally
 consistent segments.  It returns a numeric value (the segment type)
 for each moment at which a new segment starts.
</p>
<p>For music with clearly tonally distinguishable sections such as verse,
 chorus, etc., segments with the same type may be expected to be
 similar to one another in some structural sense.  For example,
 repetitions of the chorus are likely to share a segment type.
</p>
<p>The plugin only attempts to identify similar segments; it does not
 attempt to label them.  For example, it makes no attempt to tell you
 which segment is the chorus.
</p>
<p>Note that this plugin does a substantial amount of processing after
 receiving all of the input audio data, before it produces any results.
</p>
<h3>Method</h3>

<p>The method relies upon structural/timbral similarity to obtain the
 high-level song structure.  This is based on the assumption that the
 distributions of timbre features are similar over corresponding
 structural elements of the music.
</p>
<p>The algorithm works by obtaining a frequency-domain representation of
 the audio signal using a Constant-Q transform, a Chromagram or
 Mel-Frequency Cepstral Coefficients (MFCC) as underlying features (the
 particular feature is selectable as a parameter).  The extracted
 features are normalised in accordance with the MPEG-7 standard (NASE
 descriptor), which means the spectrum is converted to decibel scale
 and each spectral vector is normalised by the RMS energy envelope.
 The value of this envelope is stored for each processing block of
 audio. This is followed by the extraction of 20 principal components
 per block using PCA, yielding a sequence of 21-dimensional feature
 vectors where the last element in each vector corresponds to the
 energy envelope.
</p>
<p>A 40-state Hidden Markov Model is then trained on the whole sequence
 of features, with each state of the HMM corresponding to a specific
 timbre type. This process partitions the timbre-space of a given track
 into 40 possible types. The important assumption of the model is that
 the distribution of these features remain consistent over a structural
 segment. After training and decoding the HMM, the song is assigned a
 sequence of timbre-features according to specific timbre-type
 distributions for each possible structural segment.
</p>
<p>The segmentation itself is computed by clustering timbre-type
 histograms. A series of histograms is created over a sliding window,
 and these are grouped into M clusters by an adapted soft k-means
 algorithm. Each of these clusters corresponds to a specific
 segment-type of the analysed song. Reference histograms, iteratively
 updated during clustering, describe the timbre distribution for each
 segment. The segmentation arises from the final cluster assignments.
</p>
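<p>As a structural sketch only (not the plugin's actual implementation), the
 histogram and clustering step could look like the following, with a plain
 hard k-means standing in for the adapted constrained soft k-means used by
 the plugin, and an arbitrary illustrative window length.
</p>
<pre>
import numpy as np

def timbre_histograms(state_labels, n_states=40, window=15):
    """Histograms of HMM timbre-type labels over a sliding window.
    The window length of 15 blocks is an illustrative value only."""
    histograms = []
    for start in range(len(state_labels) - window + 1):
        counts = np.bincount(state_labels[start:start + window],
                             minlength=n_states)
        histograms.append(counts / counts.sum())
    return np.array(histograms)

def cluster_histograms(histograms, m_clusters=10, iterations=50, seed=0):
    """Plain hard k-means over the histograms; the plugin itself uses an
    adapted, constrained soft k-means, so this is only a sketch."""
    rng = np.random.default_rng(seed)
    centres = histograms[rng.choice(len(histograms), m_clusters, replace=False)]
    for _ in range(iterations):
        distances = np.linalg.norm(histograms[:, None, :] - centres[None, :, :],
                                   axis=2)
        assignment = distances.argmin(axis=1)
        for k in range(m_clusters):
            if np.any(assignment == k):
                centres[k] = histograms[assignment == k].mean(axis=0)
    return assignment
</pre>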
<h3>Parameters</h3>

<p><b>Number of segment-types</b> &ndash; The maximum number of clusters
 (segment-types) to be returned.  The default is 10. Unlike many
 clustering algorithms, the constrained clustering used in this plugin
 does not produce too many clusters or vary significantly even if this
 is set too high. However, this parameter can be useful for limiting
 the number of expected segment-types.
</p>
<p><b>Feature Type</b> &ndash; The type of spectral feature used for segmentation.  The available features are:<ul><li>"Hybrid", the default, which uses a Constant-Q transform (see <a href="#qm-constantq">related
 plugin</a>): this is generally effective for modern studio recordings;</li><li> "Chromatic", using a chromagram derived from the Constant-Q feature (see <a href="#qm-chromagram">related plugin</a>): this may be preferable for live, acoustic, or older recordings, in which repeated sections may be less consistent in
 sound;</li><li>"Timbral", using Mel-Frequency
 Cepstral Coefficients (see <a href="#qm-mfcc">related plugin</a>), which is more likely to
 result in classification by instrumentation rather than musical
 content.</li></ul>
</p>
<p><b>Minimum segment duration</b> &ndash; The approximate expected minimum
 duration for a segment, from 1 to 15 seconds.  Changing this parameter
 may help the plugin to find musical sections rather than just
 following changes in the sound of the music, and also avoid wasting a
 segment-type cluster for timbrally distinct but too-short segments.
 The default of 4 seconds usually produces good results.
</p>
<h3>Outputs</h3>

<p><b>Segmentation</b> &ndash; The estimated segment boundaries, returned as a
 single feature with one value at each segment boundary, with the value
 representing the segment type number for the segment starting at that
 boundary.
</p>
<h3>References and Credits</h3>

<p><b>Method</b>: M. Levy and M. Sandler. <i><a href="http://ieeexplore.ieee.org/iel5/10376/4432632/04432648.pdf?arnumber=4432648">Structural segmentation of musical audio by constrained clustering</a></i>. IEEE Transactions on Audio, Speech, and Language Processing, February 2008.
</p>
<p>Note that this plugin does not implement the beat-synchronous aspect
 of the segmentation method described in the paper.
</p>
<p>The Segmenter Vamp plugin was written by Mark Levy.  Thanks to George
 Fazekas for providing much of this documentation.
</p>
<a name="qm-similarity"></a><h2>6. Similarity</h2>

<p><b>System identifier</b> &ndash;    <code>vamp:qm-vamp-plugins:qm-similarity</code>
<br><b>RDF URI</b> &ndash; <a href="http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-similarity">http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-similarity</a>
<br><b>Links</b> &ndash; <a href="#">Back to top of library documentation</a> &ndash; <a href="http://www.elec.qmul.ac.uk/digitalmusic/downloads/index.html#qm-vamp-plugins">Download location</a>
</p>
<p>Similarity treats each channel of its audio input as a separate
 "track", and estimates how similar the tracks are to one another using
 a selectable similarity measure.
</p>
<p>The plugin also returns the intermediate data used as a basis of the
 similarity measure; it can therefore be used on a single channel of
 input (with the resulting intermediate data then being applied in some
 other similarity or clustering algorithm, for example) if desired, as
 well as with multiple inputs.
</p>
<p>Because of the way this plugin handles multiple inputs, by assuming
 that each channel represents a separate piece of music, it may not be
 appropriate for use directly in a general-purpose host (unless you
 actually want to do something like compare two stereo channels for
 timbral similarity, which is unlikely).
</p>
<h3>Parameters</h3>

<p><b>Feature Type</b> &ndash; The underlying audio feature used for the similarity
 measure.  The available features are:
<ul><li>"Timbre", in which the distance
 between tracks is a symmetrised Kullback-Leibler divergence between
 Gaussian-modelled MFCC means and variances across each track, for the
 first 20 MFCCs including C0 (see <a href="#qm-mfcc">related plugin</a> and the sketch following this list);</li><li>"Chroma", which uses the Kullback-Leibler divergence between
 mean chroma histograms (see <a href="#qm-chromagram">related plugin</a>);</li><li>"Rhythm", using the cosine distance between
 "beat spectrum" measures derived from a short sampled section of the
 track;</li><li>and combined "Timbre and Rhythm" and "Chroma and Rhythm"
 features.</li></ul>
</p>
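<p>For the "Timbre" feature, the distance is described above as a symmetrised
 Kullback-Leibler divergence between Gaussian models of each track's MFCCs.
 A minimal sketch for diagonal-covariance Gaussians follows; the exact
 symmetrisation and any normalisation used inside the plugin may differ.
 The means and variances involved correspond to the Feature Means and
 Feature Variances outputs described below.
</p>
<pre>
import numpy as np

def kl_gaussian_diag(mean_p, var_p, mean_q, var_q):
    """KL(p || q) between two diagonal-covariance Gaussians, given their
    per-dimension means and variances."""
    return 0.5 * np.sum(var_p / var_q
                        + (mean_q - mean_p) ** 2 / var_q
                        - 1.0
                        + np.log(var_q / var_p))

def symmetrised_kl(mean_a, var_a, mean_b, var_b):
    """One common symmetrisation: the sum of the two directed divergences."""
    return (kl_gaussian_diag(mean_a, var_a, mean_b, var_b)
            + kl_gaussian_diag(mean_b, var_b, mean_a, var_a))
</pre>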
<h3>Outputs</h3>

<p><b>Distance Matrix</b> &ndash; A matrix of the distance measures between input
 channels, returned as a series of vector features timestamped at
 one-second intervals.  The distance from channel i to channel j
 appears as the j-th bin of the feature at time i.
</p>
<p><b>Distance from First Channel</b> &ndash; A single vector feature, timestamped
 at time zero, containing the distances between the first input channel
 and each of the input channels (including the first channel itself at
 bin 0, which should have zero distance).
</p>
<p><b>Ordered Distances from First Channel</b> &ndash; A pair of vector features,
 at times 0 and 1 second.  The feature at time 0 contains the 1-based
 indices of the input channels in the order of similarity to the first
 input channel (so its first bin should always contain 1, as the first
 channel is most similar to itself).  The feature at time 1 contains,
 in bin n, the distance between the first input channel and the channel
 with index found at bin n of the feature at time 0.
</p>
<p><b>Feature Means</b> &ndash; A series of vector features containing the mean
 values of each of the feature bins across the duration of each of the
 input channels.  This output returns one feature for each input
 channel, timestamped at one-second intervals.  The number of bins for
 each feature depends on the feature type; it will be 20 for MFCC
 features and 12 for chroma features.  No features will be returned on
 this output if the feature type is purely rhythmic.
</p>
<p><b>Feature Variances</b> &ndash; Just as Feature Means, but variances.
</p>
<p><b>Beat Spectra</b> &ndash; A series of vector features containing the rhythmic
 autocorrelation profiles (beat spectra) for each of the input
 channels.  This output returns one 512-bin feature for each input
 channel, timestamped at one-second intervals.  No features will be
 returned on this output if the feature type contains no rhythm
 component.
</p>
<h3>References and Credits</h3>

<p><b>Timbral similarity</b>: M. Levy and M. Sandler. <i><a href="http://www.elec.qmul.ac.uk/easaier/papers/mlevytimbralsimilarity.pdf">Lightweight measures for timbral similarity of musical audio</a></i>. In Proceedings of the 1st
 ACM workshop on Audio and Music Computing Multimedia, Santa Barbara,
 2006.
</p>
<p><b>Combined rhythmic and timbral similarity</b>: K. Jacobson. <i><a href="http://ismir2006.ismir.net/PAPERS/ISMIR0696_Paper.pdf">A Multifaceted Approach to Music Similarity</a></i>. In Proceedings of the
 Seventh International Conference on Music Information Retrieval
 (ISMIR), 2006.
</p>
<p>The Similarity Vamp plugin was written by Mark Levy, Kurt Jacobson and
 Chris Cannam.
</p>
<a name="qm-constantq"></a><h2>7. Constant-Q Spectrogram</h2>

<p><b>System identifier</b> &ndash;    <code>vamp:qm-vamp-plugins:qm-constantq</code>
<br><b>RDF URI</b> &ndash; <a href="http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-constantq">http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-constantq</a>
<br><b>Links</b> &ndash; <a href="#">Back to top of library documentation</a> &ndash; <a href="http://www.elec.qmul.ac.uk/digitalmusic/downloads/index.html#qm-vamp-plugins">Download location</a>
</p>
<p>Constant-Q Spectrogram calculates a spectrogram based on a short-time
 windowed constant Q spectral transform.  This is a spectrogram in
 which the ratio of centre frequency to resolution is constant for each
 frequency bin.  The frequency bins correspond to the frequencies of
 "musical notes" rather than being linearly spaced in frequency as they
 are for the conventional DFT spectrogram.
</p>
<p>The pitch range and the number of frequency bins per octave may be
 adjusted using the plugin's parameters.  Note that the plugin's
 preferred step and block sizes are defined by these parameters, and
 the plugin will not accept any other block size than its preferred
 value.
</p>
<h3>Parameters</h3>

<p><b>Minimum Pitch</b> &ndash; The MIDI pitch value (0-127) corresponding to the lowest
 frequency to be included in the constant-Q transform.
</p>
<p><b>Maximum Pitch</b> &ndash; The MIDI pitch value (0-127) corresponding to the
 highest frequency to be included in the constant-Q transform.
</p>
<p><b>Tuning Frequency</b> &ndash; The frequency of concert A in the
 music under analysis.
</p>
<p><b>Bins per Octave</b> &ndash; The number of constant-Q transform bins to be
 computed per octave.
</p>
<p><b>Normalized</b> &ndash; Whether to normalize each output column to unit
 maximum.
</p>
<h3>Outputs</h3>

<p><b>Constant-Q Spectrogram</b> &ndash; The calculated spectrogram, as a single
 feature per process block containing one bin for each pitch included
 in the spectrogram's range.
</p>
<h3>References and Credits</h3>

<p><b>Principle</b>: J. Brown. <i><a href="http://www.wellesley.edu/Physics/brown/pubs/cq1stPaper.pdf">Calculation of a constant Q spectral transform</a></i>. Journal of the Acoustical Society of America, 89(1):
 425-434, 1991.
</p>
<p>The Constant-Q Spectrogram Vamp plugin was written by Christian
 Landone.
</p>
<a name="qm-chromagram"></a><h2>8. Chromagram</h2>

<p><b>System identifier</b> &ndash;    <code>vamp:qm-vamp-plugins:qm-chromagram</code>
<br><b>RDF URI</b> &ndash; <a href="http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-chromagram">http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-chromagram</a>
<br><b>Links</b> &ndash; <a href="#">Back to top of library documentation</a> &ndash; <a href="http://www.elec.qmul.ac.uk/digitalmusic/downloads/index.html#qm-vamp-plugins">Download location</a>
</p>
<p>Chromagram calculates a constant Q spectral transform (as in the
 Constant Q Spectrogram plugin) and then wraps the frequency bin values
 into a single octave, with each bin containing the sum of the
 magnitudes from the corresponding bin in all octaves.  The number of
 values in each feature vector returned by the plugin is therefore the
 same as the number of bins per octave configured for the underlying
 constant Q transform.
</p>
<p>The pitch range and the number of frequency bins per octave for the
 transform may be adjusted using the plugin's parameters.  Note that
 the plugin's preferred step and block sizes depend on these
 parameters, and the plugin will not accept any other block size than
 its preferred value.
</p>
<h3>Parameters</h3>

<p><b>Minimum Pitch</b> &ndash; The MIDI pitch value (0-127) corresponding to the
 lowest frequency to be included in the constant-Q transform used in
 calculating the chromagram.
</p>
<p><b>Maximum Pitch</b> &ndash; The MIDI pitch value (0-127) corresponding to the
 highest frequency to be included in the constant-Q transform used in
 calculating the chromagram.
</p>
<p><b>Tuning Frequency</b> &ndash; The frequency of concert A in the
 music under analysis.
</p>
<p><b>Bins per Octave</b> &ndash; The number of constant-Q transform bins to be
 computed per octave, and thus the total number of bins present in the
 resulting chromagram.
</p>
<p><b>Normalized</b> &ndash; Whether to normalize each output column. Normalization
 may be to unit sum or unit maximum.
</p>
<h3>Outputs</h3>

<p><b>Chromagram</b> &ndash; The calculated chromagram, as a single feature per
 process block containing the number of bins given in the bins per
 octave parameter.
</p>
<h3>References and Credits</h3>

<p>The Chromagram Vamp plugin was written by Christian Landone.
</p>
<a name="qm-mfcc"></a><h2>9. Mel-Frequency Cepstral Coefficients</h2>

<p><b>System identifier</b> &ndash;    <code>vamp:qm-vamp-plugins:qm-mfcc</code>
<br><b>RDF URI</b> &ndash; <a href="http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-mfcc">http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-mfcc</a>
<br><b>Links</b> &ndash; <a href="#">Back to top of library documentation</a> &ndash; <a href="http://www.elec.qmul.ac.uk/digitalmusic/downloads/index.html#qm-vamp-plugins">Download location</a>
</p>
<p>Mel-Frequency Cepstral Coefficients calculates MFCCs from a single
 channel of audio.  These coefficients, derived from a cosine transform
 of the mapping of an audio spectrum onto a frequency scale modelled on
 human auditory response, are widely used in speech recognition, music
 classification and other tasks.
</p>
<h3>Parameters</h3>

<p><b>Number of Coefficients</b> &ndash; The number of MFCCs to return.  Commonly
 used values include 13 and the default of 20.  This number includes C0 if
 requested (see Include C0 below).
</p>
<p><b>Power for Mel Amplitude Logs</b> &ndash; An optional power value to which the
 spectral amplitudes should be raised before applying the cosine
 transform.  Values greater than 1 may in principle reduce the
 contribution of noise to the results.  The default is 1.
</p>
<p><b>Include C0</b> &ndash; Whether to include the "zeroth" coefficient, which
 simply reflects the overall signal power across the Mel frequency
 bands.
</p>
<h3>Outputs</h3>

<p><b>Coefficients</b> &ndash; The MFCC values, returned as one vector feature per
 processing block.
</p>
<p><b>Means of Coefficients</b> &ndash; The overall means of the MFCC bins, as a
 single vector feature with time 0 that is returned when processing is
 complete.
</p>
<h3>References and Credits</h3>

<p><b>MFCCs in music</b>: See B. Logan. <i><a href="http://ismir2000.ismir.net/papers/logan_paper.pdf">Mel-Frequency Cepstral Coefficients for Music Modeling</a></i>. In Proceedings of the First International
 Symposium on Music Information Retrieval (ISMIR), 2000.
</p>
<p>The Mel-Frequency Cepstral Coefficients Vamp plugin was written by
 Nicolas Chetry and Chris Cannam.
</p>
</body>
</html>