comparison plugin-doc/qm-vamp-plugins.html @ 16:16f8de0dc974 website

* Add doc for QM plugins
author cannam
date Fri, 21 Nov 2008 11:41:45 +0000
parents
children 90a1fa18d239
15:c57ba57f33fa 16:16f8de0dc974
1 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
2 <html>
3 <head>
4 <link rel="stylesheet" media="screen" type="text/css" href="/screen.css"/>
5 <link rel="icon" type="image/png" href="/images/waveform.png"/>
6 <link rel="shortcut icon" type="image/png" href="/images/waveform.png"/>
7 <title>QM Vamp Plugins: User Documentation</title>
8 <meta name="robots" content="index"/>
9 </head>
10 <body>
11 <h1 id="header"><span>Vamp Plugins</span></h1>
12
13 <h2>QM Vamp Plugins</h2>
14
15 <p>The QM Vamp Plugin set is a library of Vamp audio feature
16 extraction plugins developed at the <a
17 href="http://www.elec.qmul.ac.uk/digitalmusic/">Centre for Digital
18 Music</a> at Queen Mary, University of London. These plugins are
19 provided as a single library file, made available in binary form for
20 Windows, OS X, and Linux from the Centre for Digital Music's <a
21 href="http://www.elec.qmul.ac.uk/digitalmusic/downloads/index.html#qm-vamp-plugins">download
22 page</a>.
23 </p>
24 <p>For more information about Vamp plugins, see <a href="http://www.vamp-plugins.org/">http://www.vamp-plugins.org/</a>.
25 </p>
26
27 <div class="toc2">1. &nbsp;<a href="#qm-onsetdetector">Note Onset Detector</a></div>
28 <div class="toc2">2. &nbsp;<a href="#qm-tempotracker">Tempo and Beat Tracker</a></div>
29 <div class="toc2">3. &nbsp;<a href="#qm-keydetector">Key Detector</a></div>
30 <div class="toc2">4. &nbsp;<a href="#qm-tonalchange">Tonal Change</a></div>
31 <div class="toc2">5. &nbsp;<a href="#qm-segmenter">Segmenter</a></div>
32 <div class="toc2">6. &nbsp;<a href="#qm-similarity">Similarity</a></div>
33 <div class="toc2">7. &nbsp;<a href="#qm-constantq">Constant-Q Spectrogram</a></div>
34 <div class="toc2">8. &nbsp;<a href="#qm-chromagram">Chromagram</a></div>
35 <div class="toc2">9. &nbsp;<a href="#qm-mfcc">Mel-Frequency Cepstral Coefficients</a></div>
36
37 <a name="qm-onsetdetector"></a><a name="qm-"></a><h2>1. Note Onset Detector</h2>
38
39 <p><b>System identifier</b> &ndash; <code>vamp:qm-vamp-plugins:qm-onsetdetector</code>
40 <br><b>RDF URI</b> &ndash; <a href="http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-onsetdetector">http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-onsetdetector</a>
41 <br><b>Links</b> &ndash; <a href="#">Back to top of library documentation</a> &ndash; <a href="http://www.elec.qmul.ac.uk/digitalmusic/downloads/index.html#qm-vamp-plugins">Download location</a>
42 </p>
43 <p>Note Onset Detector analyses a single channel of audio and estimates
44 the onset times of notes within the music &ndash; that is, the times at
45 which notes and other audible events begin.
46 </p>
47 <p>It calculates an onset likelihood function for each spectral frame,
48 and picks peaks in a smoothed version of this function. The plugin is
49 non-causal, returning all results at the end of processing.
50 </p>
51 <h3>Parameters</h3>
52
53 <p><b>Onset Detection Function Type</b> &ndash; The method used to calculate the
54 onset likelihood function. The most versatile method is the default,
55 "Complex Domain" (see reference, Duxbury et al 2003). "Spectral
56 Difference" may be appropriate for percussive recordings, "Phase
57 Deviation" for non-percussive music, and "Broadband Energy Rise" (see
58 reference, Barry et al 2005) for identifying percussive onsets in
59 mixed music.
60 </p>
61 <p><b>Onset Detector Sensitivity</b> &ndash; Sensitivity level for peak detection
62 in the onset likelihood function. The higher the sensitivity, the
63 more onsets will (rightly or wrongly) be detected. The peak picker
64 does not have a simple threshold level; instead, this parameter
65 controls the required "steepness" of the slopes in the smoothed
66 detection function either side of a peak value, in order for that peak
67 to be accepted as an onset.
68 </p>
69 <p><b>Adaptive Whitening</b> &ndash; This option evens out the temporal and
70 frequency variation in the signal, which can yield improved
71 performance in onset detection, for example in audio with big
72 variations in dynamics.
73 </p>
74 <h3>Outputs</h3>
75
76 <p><b>Note Onsets</b> &ndash; The detected note onset times, returned as a single
77 feature with timestamp but no value for each detected note.
78 </p>
79 <p><b>Onset Detection Function</b> &ndash; The raw note onset likelihood function
80 that was calculated as the first step of the detection process.
81 </p>
82 <p><b>Smoothed Detection Function</b> &ndash; The note onset likelihood function
83 following median filtering. This is the function from which
84 sufficiently steep peak values are picked and classified as onsets.
85 </p>
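<p>As a rough illustration of the two steps described above (median
smoothing, then picking sufficiently steep peaks), the following Python
sketch may help. The function names, the window size and the
<code>min_slope</code> parameter are assumptions made for this example;
it is not the plugin's actual peak picker, whose steepness test is more
involved.</p>

```python
from statistics import median

def median_smooth(values, window=3):
    """Median-filter a sequence, shrinking the window at the edges."""
    half = window // 2
    return [median(values[max(0, i - half): i + half + 1])
            for i in range(len(values))]

def pick_peaks(smoothed, min_slope=0.1):
    """Return indices of local maxima whose rise and fall are both
    at least min_slope (a stand-in for the "steepness" requirement)."""
    peaks = []
    for i in range(1, len(smoothed) - 1):
        rise = smoothed[i] - smoothed[i - 1]
        fall = smoothed[i] - smoothed[i + 1]
        if rise >= min_slope and fall >= min_slope:
            peaks.append(i)
    return peaks

# A lower min_slope plays the role of a higher sensitivity setting:
# shallower peaks are then accepted as onsets as well.
function = [0.0, 0.2, 0.6, 0.2, 0.0, 0.3, 0.31, 0.3]
strict = pick_peaks(function, min_slope=0.1)
lenient = pick_peaks(function, min_slope=0.005)
```

<p>In this sketch the second, very shallow peak is accepted only with
the smaller <code>min_slope</code>, which mirrors how a higher
sensitivity yields more (rightly or wrongly) detected onsets.</p>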
86 <h3>References and Credits</h3>
87
88 <p><b>Basic detection methods</b>: C. Duxbury, J. P. Bello, M. Davies and
89 M. Sandler, <i><a href="http://www.elec.qmul.ac.uk/dafx03/proceedings/pdfs/dafx81.pdf">Complex domain Onset Detection for Musical Signals</a></i>. In
90 Proceedings of the 6th Conference on Digital Audio Effects
91 (DAFx-03). London, UK. September 2003.
92 </p>
93 <p><b>Adaptive whitening</b>: D. Stowell and M. D. Plumbley, <i><a href="http://www.elec.qmul.ac.uk/digitalmusic/papers/2007/StowellPlumbley07-icmc.pdf">Adaptive whitening for improved real-time audio onset detection</a></i>. In
94 Proceedings of the International Computer Music Conference (ICMC'07),
95 August 2007.
96 </p>
97 <p><b>Percussion onset detector</b>: D. Barry, D. Fitzgerald, E. Coyle and
98 B. Lawlor, <i><a href="http://eleceng.dit.ie/papers/15.pdf">Drum Source Separation using Percussive Feature Detection and Spectral Modulation</a></i>. ISSC 2005.
99 </p>
100 <p>The Note Onset Detector Vamp plugin was written by Chris Duxbury, Juan
101 Pablo Bello and Christian Landone.
102 </p>
103 <a name="qm-tempotracker"></a><h2>2. Tempo and Beat Tracker</h2>
104
105 <p><b>System identifier</b> &ndash; <code>vamp:qm-vamp-plugins:qm-tempotracker</code>
106 <br><b>RDF URI</b> &ndash; <a href="http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-tempotracker">http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-tempotracker</a>
107 <br><b>Links</b> &ndash; <a href="#">Back to top of library documentation</a> &ndash; <a href="http://www.elec.qmul.ac.uk/digitalmusic/downloads/index.html#qm-vamp-plugins">Download location</a>
108 </p>
109 <p>Tempo and Beat Tracker analyses a single channel of audio and
110 estimates the positions of metrical beats within the music (the
111 equivalent of a human listener tapping their foot to the beat).
112 </p>
113 <h3>Parameters</h3>
114
115 <p><b>Onset Detection Function Type</b> &ndash; The method used to calculate the
116 onset likelihood function. The most versatile method is the default,
117 "Complex Domain" (see reference, Duxbury et al 2003). "Spectral
118 Difference" may be appropriate for percussive recordings, "Phase
119 Deviation" for non-percussive music, and "Broadband Energy Rise" (see
120 reference, Barry et al 2005) for identifying percussive onsets in
121 mixed music.
122 </p>
123 <p><b>Adaptive Whitening</b> &ndash; This option evens out the temporal and
124 frequency variation in the signal, which can yield improved
125 performance in onset detection, for example in audio with big
126 variations in dynamics.
127 </p>
128 <h3>Outputs</h3>
129
130 <p><b>Beats</b> &ndash; The estimated beat locations, returned as a single feature,
131 with timestamp but no value, for each beat, labelled with the
132 corresponding estimated tempo at that beat.
133 </p>
134 <p><b>Onset Detection Function</b> &ndash; The raw note onset likelihood function
135 used in beat estimation.
136 </p>
137 <p><b>Tempo</b> &ndash; The estimated tempo, returned as a feature each time the
138 estimated tempo changes, with a single value for the tempo in beats
139 per minute.
140 </p>
141 <h3>References and Credits</h3>
142
143 <p><b>Beat tracking method</b>: M. E. P. Davies and M. D. Plumbley.
144 <i><a href="http://www.elec.qmul.ac.uk/people/markp/2007/DaviesPlumbley07-taslp.pdf">Context-dependent beat tracking of musical audio</a></i>. In IEEE
145 Transactions on Audio, Speech and Language Processing. Vol. 15, No. 3,
146 pp1009-1020, 2007. See also M. E. P. Davies and M. D. Plumbley.
147 <i><a href="http://www.elec.qmul.ac.uk/people/markp/2005/DaviesPlumbley05-icassp.pdf">Beat Tracking With A Two State Model</a></i>. In Proceedings of the IEEE
148 International Conference on Acoustics, Speech and Signal Processing
149 (ICASSP 2005), Vol. 3, pp241-244, Philadelphia, USA, March 19-23, 2005.
150 </p>
151 <p><b>Onset detection methods</b>: C. Duxbury, J. P. Bello, M. Davies and
152 M. Sandler, <i><a href="http://www.elec.qmul.ac.uk/dafx03/proceedings/pdfs/dafx81.pdf">Complex domain Onset Detection for Musical Signals</a></i>. In
153 Proceedings of the 6th Conference on Digital Audio Effects
154 (DAFx-03). London, UK. September 2003.
155 </p>
156 <p><b>Adaptive whitening</b>: D. Stowell and M. D. Plumbley, <i><a href="http://www.elec.qmul.ac.uk/digitalmusic/papers/2007/StowellPlumbley07-icmc.pdf">Adaptive whitening for improved real-time audio onset detection</a></i>. In
157 Proceedings of the International Computer Music Conference (ICMC'07),
158 August 2007.
159 </p>
160 <p><b>Percussion onset detector</b>: D. Barry, D. Fitzgerald, E. Coyle and
161 B. Lawlor, <i><a href="http://eleceng.dit.ie/papers/15.pdf">Drum Source Separation using Percussive Feature Detection and Spectral Modulation</a></i>. ISSC 2005.
162 </p>
163 <p>The Tempo and Beat Tracker Vamp plugin was written by Matthew Davies
164 and Christian Landone.
165 </p>
166 <a name="qm-keydetector"></a><h2>3. Key Detector</h2>
167
168 <p><b>System identifier</b> &ndash; <code>vamp:qm-vamp-plugins:qm-keydetector</code>
169 <br><b>RDF URI</b> &ndash; <a href="http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-keydetector">http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-keydetector</a>
170 <br><b>Links</b> &ndash; <a href="#">Back to top of library documentation</a> &ndash; <a href="http://www.elec.qmul.ac.uk/digitalmusic/downloads/index.html#qm-vamp-plugins">Download location</a>
171 </p>
172 <p>Key Detector analyses a single channel of audio and continuously
173 estimates the key of the music by comparing the degree to which a
174 block-by-block chromagram correlates to the stored key profiles for
175 each major and minor key.
176 </p>
177 <p>The key profiles are drawn from analysis of Book I of the
178 Well-Tempered Clavier by J S Bach, recorded at A=440 equal temperament.
179 </p>
180 <h3>Parameters</h3>
181
182 <p><b>Tuning Frequency</b> &ndash; The frequency of concert A in the music under
183 analysis.
184 </p>
185 <p><b>Window Length</b> &ndash; The number of chroma analysis frames taken into
186 account for key estimation. This controls how eager the key detector
187 will be to return short-duration tonal changes as new key changes (the
188 shorter the window, the more likely it is to detect a new key change).
189 </p>
190 <h3>Outputs</h3>
191
192 <p><b>Tonic Pitch</b> &ndash; The tonic pitch of each estimated key change,
193 returned as a single-valued feature at the point where the key change
194 is detected, with value counted from 1 to 12 where C is 1, C# or Db is
195 2, and so on up to B which is 12.
196 </p>
197 <p><b>Key Mode</b> &ndash; The major or minor mode of the estimated key, where
198 major is 0 and minor is 1.
199 </p>
200 <p><b>Key</b> &ndash; The estimated key for each key change, returned as a
201 single-valued feature at the point where the key change is detected,
202 with value counted from 1 to 24 where 1-12 are the major keys and
203 13-24 are the minor keys, such that C major is 1, C# major is 2, and
204 so on up to B major which is 12; then C minor is 13, C# minor is 14,
205 and so on up to B minor which is 24.
206 </p>
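<p>The numbering of the Key output can be decoded as in the following
Python sketch. The function name and the spelling chosen for the
black-key pitches (for example "C#" rather than "Db") are assumptions
made for this example; the plugin supplies its own text labels.</p>

```python
# Pitch-class names for tonic values 1..12 (spelling is an assumption).
PITCH_NAMES = ["C", "C#", "D", "Eb", "E", "F",
               "F#", "G", "Ab", "A", "Bb", "B"]

def decode_key(value):
    """Split a Key output value (1-24) into (tonic 1-12, mode, name).

    mode is 0 for major and 1 for minor, matching the Key Mode output.
    """
    if not 1 <= value <= 24:
        raise ValueError("key value must be in the range 1..24")
    index = int(value) - 1
    tonic = index % 12 + 1      # 1 = C ... 12 = B
    mode = index // 12          # 0 = major, 1 = minor
    name = PITCH_NAMES[tonic - 1] + (" minor" if mode else " major")
    return tonic, mode, name
```

<p>For example, <code>decode_key(13)</code> gives tonic 1, mode 1 and
the name "C minor".</p>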
207 <p><b>Key Strength Plot</b> &ndash; A grid representing the ongoing key
208 "probability" throughout the music. This is returned as a feature for
209 each chroma frame, containing 25 bins. Bins 1-12 are the major keys
210 from C upwards; bins 14-25 are the minor keys from C upwards. The
211 13th bin is unused: it just provides space between the first and
212 second halves of the feature if displayed in a single plot.
213 </p>
214 <p>The outputs are also labelled with pitch or key as text.
215 </p>
216 <h3>References and Credits</h3>
217
218 <p><b>Method</b>: see K. Noland and M. Sandler. <i><a href="http://www.aes.org/e-lib/browse.cfm?elib=14140">Signal Processing Parameters for Tonality Estimation</a></i>. In Proceedings of Audio Engineering Society
219 122nd Convention, Vienna, 2007.
220 </p>
221 <p>The Key Detector Vamp plugin was written by Katy Noland and Christian
222 Landone.
223 </p>
224 <a name="qm-tonalchange"></a><h2>4. Tonal Change</h2>
225
226 <p><b>System identifier</b> &ndash; <code>vamp:qm-vamp-plugins:qm-tonalchange</code>
227 <br><b>RDF URI</b> &ndash; <a href="http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-tonalchange">http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-tonalchange</a>
228 <br><b>Links</b> &ndash; <a href="#">Back to top of library documentation</a> &ndash; <a href="http://www.elec.qmul.ac.uk/digitalmusic/downloads/index.html#qm-vamp-plugins">Download location</a>
229 </p>
230 <p>Tonal Change analyses a single channel of audio, detecting harmonic
231 changes such as chord boundaries.
232 </p>
233 <h3>Parameters</h3>
234
235 <p><b>Gaussian smoothing</b> &ndash; The window length for the internal smoothing
236 operation, in chroma analysis frames. This controls how eager the
237 tonal change detector will be to identify very short-term tonal
238 changes. The default value of 5 is quite short, and may lead to more
239 (not always meaningful) results being returned; for many purposes a
240 larger value, closer to the maximum of 20, may be appropriate.
241 </p>
242 <p><b>Chromagram minimum pitch</b> &ndash; The MIDI pitch value (0-127) of the
243 minimum pitch included in the internal chromagram analysis.
244 </p>
245 <p><b>Chromagram maximum pitch</b> &ndash; The MIDI pitch value (0-127) of the
246 maximum pitch included in the internal chromagram analysis.
247 </p>
248 <p><b>Chromagram tuning frequency</b> &ndash; The frequency of concert A in the
249 music under analysis.
250 </p>
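<p>The relationship between a MIDI pitch value, the tuning frequency
and the resulting frequency in Hz can be sketched in Python as follows.
This assumes twelve-tone equal temperament and the usual MIDI convention
that pitch 69 is concert A; the function name is hypothetical and is not
part of the plugin.</p>

```python
def midi_to_hz(pitch, tuning_hz=440.0):
    """Frequency in Hz of a MIDI pitch, given the frequency of concert A.

    Assumes equal temperament with MIDI pitch 69 as concert A.
    """
    return tuning_hz * 2.0 ** ((pitch - 69) / 12.0)

# Raising the tuning frequency shifts every pitch proportionally.
a4_standard = midi_to_hz(69)             # concert A at 440 Hz
a3_standard = midi_to_hz(57)             # one octave below concert A
a4_baroque = midi_to_hz(69, tuning_hz=415.0)
```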
251 <h3>Outputs</h3>
252
253 <p><b>Transform to 6D Tonal Content Space</b> &ndash; A representation of the
254 musical content in a six-dimensional tonal space onto which the
255 algorithm maps 12-bin chroma vectors extracted from the audio.
256 </p>
257 <p><b>Tonal Change Detection Function</b> &ndash; A function representing the
258 estimated likelihood of a tonal change occurring in each spectral
259 frame.
260 </p>
261 <p><b>Tonal Change Positions</b> &ndash; The resulting estimated positions of tonal
262 changes.
263 </p>
264 <h3>References and Credits</h3>
265
266 <p><b>Method</b>: C. A. Harte, M. Gasser, and M. Sandler. <i><a href="http://portal.acm.org/citation.cfm?id=1178723.1178727">Detecting harmonic change in musical audio</a></i>. In Proceedings of the 1st ACM workshop on
267 Audio and Music Computing Multimedia, Santa Barbara, 2006.
268 </p>
269 <p>The Tonal Change Vamp plugin was written by Chris Harte and Martin
270 Gasser.
271 </p>
272 <a name="qm-segmenter"></a><h2>5. Segmenter</h2>
273
274 <p><b>System identifier</b> &ndash; <code>vamp:qm-vamp-plugins:qm-segmenter</code>
275 <br><b>RDF URI</b> &ndash; <a href="http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-segmenter">http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-segmenter</a>
276 <br><b>Links</b> &ndash; <a href="#">Back to top of library documentation</a> &ndash; <a href="http://www.elec.qmul.ac.uk/digitalmusic/downloads/index.html#qm-vamp-plugins">Download location</a>
277 </p>
278 <p>Segmenter divides a single channel of music up into structurally
279 consistent segments. It returns a numeric value (the segment type)
280 for each moment at which a new segment starts.
281 </p>
282 <p>For music with clearly tonally distinguishable sections such as verse,
283 chorus, etc., segments with the same type may be expected to be
284 similar to one another in some structural sense. For example,
285 repetitions of the chorus are likely to share a segment type.
286 </p>
287 <p>The plugin only attempts to identify similar segments; it does not
288 attempt to label them. For example, it makes no attempt to tell you
289 which segment is the chorus.
290 </p>
291 <p>Note that this plugin does a substantial amount of processing after
292 receiving all of the input audio data, before it produces any results.
293 </p>
294 <h3>Method</h3>
295
296 <p>The method relies upon structural/timbral similarity to obtain the
297 high-level song structure. This is based on the assumption that the
298 distributions of timbre features are similar over corresponding
299 structural elements of the music.
300 </p>
301 <p>The algorithm works by obtaining a frequency-domain representation of
302 the audio signal using a Constant-Q transform, a Chromagram or
303 Mel-Frequency Cepstral Coefficients (MFCC) as underlying features (the
304 particular feature is selectable as a parameter). The extracted
305 features are normalised in accordance with the MPEG-7 standard (NASE
306 descriptor), which means the spectrum is converted to decibel scale
307 and each spectral vector is normalised by the RMS energy envelope.
308 The value of this envelope is stored for each processing block of
309 audio. This is followed by the extraction of 20 principal components
310 per block using PCA, yielding a sequence of 21-dimensional feature
311 vectors where the last element in each vector corresponds to the
312 energy envelope.
313 </p>
314 <p>A 40-state Hidden Markov Model is then trained on the whole sequence
315 of features, with each state of the HMM corresponding to a specific
316 timbre type. This process partitions the timbre-space of a given track
317 into 40 possible types. The important assumption of the model is that
318 the distribution of these features remains consistent over a structural
319 segment. After training and decoding the HMM, the song is assigned a
320 sequence of timbre-features according to specific timbre-type
321 distributions for each possible structural segment.
322 </p>
323 <p>The segmentation itself is computed by clustering timbre-type
324 histograms. A series of histograms is created over a sliding window,
325 and these are grouped into M clusters by an adapted soft k-means
326 algorithm. Each of these clusters will correspond to a specific
327 segment-type of the analyzed song. Reference histograms, iteratively
328 updated during clustering, describe the timbre distribution for each
329 segment. The segmentation arises from the final cluster assignments.
330 </p>
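<p>The histogram step described above can be sketched in Python as
follows. This example assumes the decoded HMM state sequence is already
available as a list of integers, uses a window that advances one frame
at a time, and stops before the clustering stage; the function name and
both parameters are hypothetical.</p>

```python
from collections import Counter

def sliding_histograms(states, n_types, window):
    """Normalised histogram of timbre types for each window position.

    states  -- decoded timbre type per frame, integers in 0..n_types-1
    n_types -- number of timbre types (40 in the description above)
    window  -- window length in frames (an assumption of this sketch)
    """
    histograms = []
    for start in range(len(states) - window + 1):
        counts = Counter(states[start:start + window])
        histograms.append([counts.get(t, 0) / window
                           for t in range(n_types)])
    return histograms

# Two timbre types, with a change of texture half way through.
example = sliding_histograms([0, 0, 1, 1], n_types=2, window=2)
```

<p>Windows that fall within one structural section give similar
histograms, which is what allows a clustering step to group them into
segment types.</p>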
331 <h3>Parameters</h3>
332
333 <p><b>Number of segment-types</b> &ndash; The maximum number of clusters
334 (segment-types) to be returned. The default is 10. Unlike many
335 clustering algorithms, the constrained clustering used in this plugin
336 does not produce too many clusters or vary significantly even if this
337 is set too high. However, this parameter can be useful for limiting
338 the number of expected segment-types.
339 </p>
340 <p><b>Feature Type</b> &ndash; The type of spectral feature used for segmentation. The available features are:<ul><li>"Hybrid", the default, which uses a Constant-Q transform (see <a href="#qm-constantq">related
341 plugin</a>): this is generally effective for modern studio recordings;</li><li> "Chromatic", using a chromagram derived from the Constant-Q feature (see <a href="#qm-chromagram">related plugin</a>): this may be preferable for live, acoustic, or older recordings, in which repeated sections may be less consistent in
342 sound;</li><li>"Timbral", using Mel-Frequency
343 Cepstral Coefficients (see <a href="#qm-mfcc">related plugin</a>), which is more likely to
344 result in classification by instrumentation rather than musical
345 content.</li></ul>
346 </p>
347 <p><b>Minimum segment duration</b> &ndash; The approximate expected minimum
348 duration for a segment, from 1 to 15 seconds. Changing this parameter
349 may help the plugin to find musical sections rather than just
350 following changes in the sound of the music, and also avoid wasting a
351 segment-type cluster for timbrally distinct but too-short segments.
352 The default of 4 seconds usually produces good results.
353 </p>
354 <h3>Outputs</h3>
355
356 <p><b>Segmentation</b> &ndash; The estimated segment boundaries, returned as a
357 single feature with one value at each segment boundary, with the value
358 representing the segment type number for the segment starting at that
359 boundary.
360 </p>
361 <h3>References and Credits</h3>
362
363 <p><b>Method</b>: M. Levy and M. Sandler. <i><a href="http://ieeexplore.ieee.org/iel5/10376/4432632/04432648.pdf?arnumber=4432648">Structural segmentation of musical audio by constrained clustering</a></i>. IEEE Transactions on Audio, Speech, and Language Processing, February 2008.
364 </p>
365 <p>Note that this plugin does not implement the beat-synchronous aspect
366 of the segmentation method described in the paper.
367 </p>
368 <p>The Segmenter Vamp plugin was written by Mark Levy. Thanks to George
369 Fazekas for providing much of this documentation.
370 </p>
371 <a name="qm-similarity"></a><h2>6. Similarity</h2>
372
373 <p><b>System identifier</b> &ndash; <code>vamp:qm-vamp-plugins:qm-similarity</code>
374 <br><b>RDF URI</b> &ndash; <a href="http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-similarity">http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-similarity</a>
375 <br><b>Links</b> &ndash; <a href="#">Back to top of library documentation</a> &ndash; <a href="http://www.elec.qmul.ac.uk/digitalmusic/downloads/index.html#qm-vamp-plugins">Download location</a>
376 </p>
377 <p>Similarity treats each channel of its audio input as a separate
378 "track", and estimates how similar the tracks are to one another using
379 a selectable similarity measure.
380 </p>
381 <p>The plugin also returns the intermediate data used as a basis of the
382 similarity measure; it can therefore be used on a single channel of
383 input (with the resulting intermediate data then being applied in some
384 other similarity or clustering algorithm, for example) if desired, as
385 well as with multiple inputs.
386 </p>
387 <p>Because of the way this plugin handles multiple inputs, by assuming
388 that each channel represents a separate piece of music, it may not be
389 appropriate for use directly in a general-purpose host (unless you
390 actually want to do something like compare two stereo channels for
391 timbral similarity, which is unlikely).
392 </p>
393 <h3>Parameters</h3>
394
395 <p><b>Feature Type</b> &ndash; The underlying audio feature used for the similarity
396 measure. The available features are:
397 <ul><li>"Timbre", in which the distance
398 between tracks is a symmetrised Kullback-Leibler divergence between
399 Gaussian-modelled MFCC means and variances across each track, for the
400 first 20 MFCCs including C0 (see <a href="#qm-mfcc">related plugin</a>);</li><li>"Chroma", which uses Kullback-Leibler divergence of
401 mean chroma histogram (see <a href="#qm-chromagram">related plugin</a>);</li><li>"Rhythm", using the cosine distance between
402 "beat spectrum" measures derived from a short sampled section of the
403 track;</li><li>and combined "Timbre and Rhythm" and "Chroma and Rhythm"
404 features.</li></ul>
405 </p>
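<p>As an illustration of the "Timbre" measure, the following Python
sketch computes a symmetrised Kullback-Leibler divergence between two
Gaussians described by per-coefficient means and variances. It assumes
diagonal covariances and sums the two directed divergences without
further scaling; both choices, and the function names, are assumptions
of this example rather than a description of the plugin's exact
arithmetic.</p>

```python
import math

def kl_diag_gauss(mean_p, var_p, mean_q, var_q):
    """KL divergence KL(p || q) between diagonal-covariance Gaussians."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total

def symmetrised_kl(mean_a, var_a, mean_b, var_b):
    """Sum of the two directed divergences, so d(a, b) == d(b, a)."""
    return (kl_diag_gauss(mean_a, var_a, mean_b, var_b)
            + kl_diag_gauss(mean_b, var_b, mean_a, var_a))

# Two single-coefficient "tracks" with equal variance and means 1 apart.
distance = symmetrised_kl([0.0], [1.0], [1.0], [1.0])
```

<p>A track compared with itself has distance zero under this measure,
which is consistent with the zero self-distance noted for the outputs
below.</p>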
406 <h3>Outputs</h3>
407
408 <p><b>Distance Matrix</b> &ndash; A matrix of the distance measures between input
409 channels, returned as a series of vector features timestamped at
410 one-second intervals. The distance from channel i to channel j
411 appears as the j'th bin of the feature at time i.
412 </p>
413 <p><b>Distance from First Channel</b> &ndash; A single vector feature, timestamped
414 at time zero, containing the distances between the first input channel
415 and each of the input channels (including the first channel itself at
416 bin 0, which should have zero distance).
417 </p>
418 <p><b>Ordered Distances from First Channel</b> &ndash; A pair of vector features,
419 at times 0 and 1 second. The feature at time 0 contains the 1-based
420 indices of the input channels in the order of similarity to the first
421 input channel (so its first bin should always contain 1, as the first
422 channel is most similar to itself). The feature at time 1 contains,
423 in bin n, the distance between the first input channel and the channel
424 with index found at bin n of the feature at time 0.
425 </p>
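<p>The layout of the Ordered Distances output can be reproduced from a
list of distances from the first channel, as in this Python sketch. The
function name is hypothetical, and the input is assumed to be the
contents of the Distance from First Channel feature.</p>

```python
def ordered_distances(distances_from_first):
    """Return (indices, distances) in order of similarity to channel 1.

    indices   -- 1-based channel indices (the feature at time 0)
    distances -- matching distances, bin for bin (the feature at time 1)
    """
    order = sorted(range(len(distances_from_first)),
                   key=lambda ch: distances_from_first[ch])
    indices = [ch + 1 for ch in order]
    distances = [distances_from_first[ch] for ch in order]
    return indices, distances

# Three input channels; the third is closer to the first than the second.
indices, distances = ordered_distances([0.0, 2.5, 1.0])
```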
426 <p><b>Feature Means</b> &ndash; A series of vector features containing the mean
427 values of each of the feature bins across the duration of each of the
428 input channels. This output returns one feature for each input
429 channel, timestamped at one-second intervals. The number of bins for
430 each feature depends on the feature type; it will be 20 for MFCC
431 features and 12 for chroma features. No features will be returned on
432 this output if the feature type is purely rhythmic.
433 </p>
434 <p><b>Feature Variances</b> &ndash; As Feature Means, but containing the variances of the feature bins rather than their means.
435 </p>
436 <p><b>Beat Spectra</b> &ndash; A series of vector features containing the rhythmic
437 autocorrelation profiles (beat spectra) for each of the input
438 channels. This output returns one 512-bin feature for each input
439 channel, timestamped at one-second intervals. No features will be
440 returned on this output if the feature type contains no rhythm
441 component.
442 </p>
443 <h3>References and Credits</h3>
444
445 <p><b>Timbral similarity</b>: M. Levy and M. Sandler. <i><a href="http://www.elec.qmul.ac.uk/easaier/papers/mlevytimbralsimilarity.pdf">Lightweight measures for timbral similarity of musical audio</a></i>. In Proceedings of the 1st
446 ACM workshop on Audio and Music Computing Multimedia, Santa Barbara,
447 2006.
448 </p>
449 <p><b>Combined rhythmic and timbral similarity</b>: K. Jacobson. <i><a href="http://ismir2006.ismir.net/PAPERS/ISMIR0696_Paper.pdf">A Multifaceted Approach to Music Similarity</a></i>. In Proceedings of the
450 Seventh International Conference on Music Information Retrieval
451 (ISMIR), 2006.
452 </p>
453 <p>The Similarity Vamp plugin was written by Mark Levy, Kurt Jacobson and
454 Chris Cannam.
455 </p>
456 <a name="qm-constantq"></a><h2>7. Constant-Q Spectrogram</h2>
457
458 <p><b>System identifier</b> &ndash; <code>vamp:qm-vamp-plugins:qm-constantq</code>
459 <br><b>RDF URI</b> &ndash; <a href="http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-constantq">http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-constantq</a>
460 <br><b>Links</b> &ndash; <a href="#">Back to top of library documentation</a> &ndash; <a href="http://www.elec.qmul.ac.uk/digitalmusic/downloads/index.html#qm-vamp-plugins">Download location</a>
461 </p>
462 <p>Constant-Q Spectrogram calculates a spectrogram based on a short-time
463 windowed constant Q spectral transform. This is a spectrogram in
464 which the ratio of centre frequency to resolution is constant for each
465 frequency bin. The frequency bins correspond to the frequencies of
466 "musical notes" rather than being linearly spaced in frequency as they
467 are for the conventional DFT spectrogram.
468 </p>
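<p>The geometric bin spacing and the constant ratio described above can
be sketched in Python as follows. The function name is hypothetical, the
MIDI-to-frequency conversion assumes equal temperament with pitch 69 as
concert A, and treating the maximum pitch as an exclusive upper bound is
an assumption of this example.</p>

```python
def constant_q_bins(min_pitch, max_pitch, bins_per_octave=12,
                    tuning_hz=440.0):
    """Return (centre frequencies in Hz, Q) for a constant-Q analysis."""
    f_min = tuning_hz * 2.0 ** ((min_pitch - 69) / 12.0)
    n_bins = round(bins_per_octave * (max_pitch - min_pitch) / 12)
    # Centre frequencies are spaced geometrically, not linearly.
    freqs = [f_min * 2.0 ** (k / bins_per_octave) for k in range(n_bins)]
    # Ratio of centre frequency to the spacing up to the next bin.
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    return freqs, q

freqs, q = constant_q_bins(min_pitch=36, max_pitch=84)
```

<p>With 12 bins per octave each bin corresponds to one semitone, and
doubling the bins per octave roughly doubles Q, so frequency resolution
improves at the cost of longer analysis windows.</p>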
469 <p>The pitch range and the number of frequency bins per octave may be
470 adjusted using the plugin's parameters. Note that the plugin's
471 preferred step and block sizes are defined by these parameters, and
472 the plugin will not accept any other block size than its preferred
473 value.
474 </p>
475 <h3>Parameters</h3>
476
477 <p><b>Minimum Pitch</b> &ndash; The MIDI pitch value (0-127) corresponding to the lowest
478 frequency to be included in the constant-Q transform.
479 </p>
480 <p><b>Maximum Pitch</b> &ndash; The MIDI pitch value (0-127) corresponding to the
481 highest frequency to be included in the constant-Q transform.
482 </p>
483 <p><b>Tuning Frequency</b> &ndash; The frequency of concert A in the
484 music under analysis.
485 </p>
486 <p><b>Bins per Octave</b> &ndash; The number of constant-Q transform bins to be
487 computed per octave.
488 </p>
489 <p><b>Normalized</b> &ndash; Whether to normalize each output column to unit
490 maximum.
491 </p>
492 <h3>Outputs</h3>
493
494 <p><b>Constant-Q Spectrogram</b> &ndash; The calculated spectrogram, as a single
495 feature per process block containing one bin for each pitch included
496 in the spectrogram's range.
497 </p>
498 <h3>References and Credits</h3>
499
500 <p><b>Principle</b>: J. Brown. <i><a href="http://www.wellesley.edu/Physics/brown/pubs/cq1stPaper.pdf">Calculation of a constant Q spectral transform</a></i>. Journal of the Acoustical Society of America, 89(1):
501 425-434, 1991.
502 </p>
503 <p>The Constant-Q Spectrogram Vamp plugin was written by Christian
504 Landone.
505 </p>
506 <a name="qm-chromagram"></a><h2>8. Chromagram</h2>
507
508 <p><b>System identifier</b> &ndash; <code>vamp:qm-vamp-plugins:qm-chromagram</code>
509 <br><b>RDF URI</b> &ndash; <a href="http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-chromagram">http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-chromagram</a>
510 <br><b>Links</b> &ndash; <a href="#">Back to top of library documentation</a> &ndash; <a href="http://www.elec.qmul.ac.uk/digitalmusic/downloads/index.html#qm-vamp-plugins">Download location</a>
511 </p>
512 <p>Chromagram calculates a constant Q spectral transform (as in the
513 Constant Q Spectrogram plugin) and then wraps the frequency bin values
514 into a single octave, with each bin containing the sum of the
515 magnitudes from the corresponding bin in all octaves. The number of
516 values in each feature vector returned by the plugin is therefore the
517 same as the number of bins per octave configured for the underlying
518 constant Q transform.
519 </p>
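<p>The octave-wrapping step can be sketched in Python as follows. The
function names are hypothetical, and the input is assumed to be one
column of constant-Q magnitudes whose first bin is aligned with the
first chroma bin.</p>

```python
def wrap_to_chroma(cq_magnitudes, bins_per_octave=12):
    """Sum constant-Q magnitudes from all octaves into a single octave."""
    chroma = [0.0] * bins_per_octave
    for k, magnitude in enumerate(cq_magnitudes):
        chroma[k % bins_per_octave] += magnitude
    return chroma

def normalise(column, mode="max"):
    """Scale a column to unit maximum ("max") or unit sum ("sum")."""
    scale = max(column) if mode == "max" else sum(column)
    return [v / scale for v in column] if scale > 0 else list(column)

# Two octaves of three bins each, folded into one three-bin chroma vector.
chroma = wrap_to_chroma([1, 2, 3, 4, 5, 6], bins_per_octave=3)
unit_max = normalise(chroma, mode="max")
unit_sum = normalise(chroma, mode="sum")
```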
520 <p>The pitch range and the number of frequency bins per octave for the
521 transform may be adjusted using the plugin's parameters. Note that
522 the plugin's preferred step and block sizes depend on these
523 parameters, and the plugin will not accept any other block size than
524 its preferred value.
525 </p>
526 <h3>Parameters</h3>
527
528 <p><b>Minimum Pitch</b> &ndash; The MIDI pitch value (0-127) corresponding to the
529 lowest frequency to be included in the constant-Q transform used in
530 calculating the chromagram.
531 </p>
532 <p><b>Maximum Pitch</b> &ndash; The MIDI pitch value (0-127) corresponding to the
533 highest frequency to be included in the constant-Q transform used in
534 calculating the chromagram.
535 </p>
536 <p><b>Tuning Frequency</b> &ndash; The frequency of concert A in the
537 music under analysis.
538 </p>
539 <p><b>Bins per Octave</b> &ndash; The number of constant-Q transform bins to be
540 computed per octave, and thus the total number of bins present in the
541 resulting chromagram.
542 </p>
543 <p><b>Normalized</b> &ndash; Whether to normalize each output column. Normalization
544 may be to unit sum or unit maximum.
545 </p>
546 <h3>Outputs</h3>
547
548 <p><b>Chromagram</b> &ndash; The calculated chromagram, as a single feature per
549 process block containing the number of bins given in the bins per
550 octave parameter.
551 </p>
552 <h3>References and Credits</h3>
553
554 <p>The Chromagram Vamp plugin was written by Christian Landone.
555 </p>
556 <a name="qm-mfcc"></a><h2>9. Mel-Frequency Cepstral Coefficients</h2>
557
558 <p><b>System identifier</b> &ndash; <code>vamp:qm-vamp-plugins:qm-mfcc</code>
559 <br><b>RDF URI</b> &ndash; <a href="http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-mfcc">http://vamp-plugins.org/rdf/plugins/qm-vamp-plugins#qm-mfcc</a>
560 <br><b>Links</b> &ndash; <a href="#">Back to top of library documentation</a> &ndash; <a href="http://www.elec.qmul.ac.uk/digitalmusic/downloads/index.html#qm-vamp-plugins">Download location</a>
561 </p>
562 <p>Mel-Frequency Cepstral Coefficients calculates MFCCs from a single
563 channel of audio. These coefficients, derived from a cosine transform
564 of the mapping of an audio spectrum onto a frequency scale modelled on
565 human auditory response, are widely used in speech recognition, music
566 classification and other tasks.
567 </p>
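<p>The two ingredients mentioned above, a perceptually motivated
frequency scale and a cosine transform, can be sketched in Python as
follows. The Mel formula used here is one common variant, the DCT is
unnormalised, and the filterbank that produces the Mel-band energies is
omitted; all names are hypothetical and none of this is taken from the
plugin's source.</p>

```python
import math

def hz_to_mel(hz):
    """Map a frequency in Hz onto the Mel scale (one common formula)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mfcc_from_mel_energies(mel_energies, n_coeffs, include_c0=True):
    """Cosine transform (DCT-II) of the log Mel-band energies.

    n_coeffs counts C0 when include_c0 is true, matching the
    Number of Coefficients parameter described below.
    """
    logs = [math.log(max(e, 1e-10)) for e in mel_energies]
    n = len(logs)
    first = 0 if include_c0 else 1
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(logs))
            for k in range(first, first + n_coeffs)]

# A flat spectrum puts all of its energy into C0.
flat = mfcc_from_mel_energies([math.e] * 4, n_coeffs=3)
```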
568 <h3>Parameters</h3>
569
570 <p><b>Number of Coefficients</b> &ndash; The number of MFCCs to return. Commonly
571 used values include 13 or the default 20. This number includes C0 if
572 requested (see Include C0 below).
573 </p>
574 <p><b>Power for Mel Amplitude Logs</b> &ndash; An optional power value to which the
575 spectral amplitudes should be raised before applying the cosine
576 transform. Values greater than 1 may in principle reduce the
577 contribution of noise to the results. The default is 1.
578 </p>
579 <p><b>Include C0</b> &ndash; Whether to include the "zero'th" coefficient, which
580 simply reflects the overall signal power across the Mel frequency
581 bands.
582 </p>
583 <h3>Outputs</h3>
584
585 <p><b>Coefficients</b> &ndash; The MFCC values, returned as one vector feature per
586 processing block.
587 </p>
588 <p><b>Means of Coefficients</b> &ndash; The overall means of the MFCC bins, as a
589 single vector feature with time 0 that is returned when processing is
590 complete.
591 </p>
592 <h3>References and Credits</h3>
593
594 <p><b>MFCCs in music</b>: See B. Logan. <i><a href="http://ismir2000.ismir.net/papers/logan_paper.pdf">Mel-Frequency Cepstral Coefficients for Music Modeling</a></i>. In Proceedings of the First International
595 Symposium on Music Information Retrieval (ISMIR), 2000.
596 </p>
597 <p>The Mel-Frequency Cepstral Coefficients Vamp plugin was written by
598 Nicolas Chetry and Chris Cannam.
599 </p>
600 <p></p>
602 </body>
603 </html>