QM Vamp Plugins
===============

Vamp audio feature extraction plugins from Queen Mary, University of London.
Version 1.4.

For more information about Vamp plugins, see http://www.vamp-plugins.org/
and http://www.sonicvisualiser.org/ .


To Install
==========

This package contains plugins for the Apple Mac OS X operating system,
compatible with both PPC and Intel hardware.

To install them, copy the files

    qm-vamp-plugins.dylib and
    qm-vamp-plugins.cat

to either

    /Library/Audio/Plug-Ins/Vamp/ (for plugins available to all users) or
    $HOME/Library/Audio/Plug-Ins/Vamp/ (for plugins available to you only).

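The copy step above can also be scripted.  The following is a minimal
sketch only: the function name install_plugins and its arguments are
hypothetical, and it assumes the two plugin files sit together in one
source directory.  By default it targets the per-user location.

```python
import shutil
from pathlib import Path

# The two files named in the install instructions above.
PLUGIN_FILES = ("qm-vamp-plugins.dylib", "qm-vamp-plugins.cat")


def install_plugins(source_dir, dest_dir=None):
    """Copy the plugin files from source_dir into dest_dir.

    dest_dir defaults to $HOME/Library/Audio/Plug-Ins/Vamp/ and is
    created if it does not exist yet.  Returns the installed paths.
    """
    source = Path(source_dir)
    if dest_dir is None:
        dest = Path.home() / "Library" / "Audio" / "Plug-Ins" / "Vamp"
    else:
        dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    installed = []
    for name in PLUGIN_FILES:
        shutil.copy2(source / name, dest / name)
        installed.append(dest / name)
    return installed
```

Passing "/Library/Audio/Plug-Ins/Vamp" as dest_dir would install for
all users, which normally requires administrator rights.
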

License
=======

These plugins are provided in binary form only.  You may install and
use the plugin binaries without fee for any purpose commercial or
non-commercial.  You may redistribute the plugin binaries provided you
do so without fee and you retain this README file with your
distribution.  You may not bundle these plugins with a commercial
product or distribute them on commercial terms.  If you wish to
arrange commercial licensing terms, please contact the Centre for
Digital Music at Queen Mary, University of London.

Copyright (c) 2006-2008 Queen Mary, University of London.  All rights
reserved except as described above.


New In This Release
===================

This release contains a new plugin to estimate timbral and rhythmic
similarity between multiple audio tracks, a plugin for structural
segmentation of music audio, and a Mel-frequency cepstral coefficients
calculation plugin.

This release also includes significant updates to the existing key
detector, tempo tracker, and chromagram plugins.


Plugins Included
================

This plugin set includes the following plugins:

 * Note onset detector

 * Beat tracker and tempo estimator

 * Key estimator and tonal change detector

 * Segmenter, to divide a track into a consistent sequence of segments

 * Timbral and rhythmic similarity between audio tracks

 * Chromagram, constant-Q spectrogram, and MFCC calculation plugins

More details about the plugins follow.


Note Onset Detector
-------------------

  Identifier:    qm-onsetdetector
  Authors:       Chris Duxbury, Juan Pablo Bello and Christian Landone
  Category:      Time > Onsets

  References:    C. Duxbury, J. P. Bello, M. Davies and M. Sandler.
                 Complex domain Onset Detection for Musical Signals.
                 In Proceedings of the 6th Conference on Digital Audio
                 Effects (DAFx-03). London, UK. September 2003.

                 D. Stowell and M. D. Plumbley.
                 Adaptive whitening for improved real-time audio onset
                 detection.
                 In Proceedings of the International Computer Music
                 Conference (ICMC'07), August 2007.

                 D. Barry, D. Fitzgerald, E. Coyle and B. Lawlor.
                 Drum Source Separation using Percussive Feature
                 Detection and Spectral Modulation.
                 ISSC 2005

The Note Onset Detector plugin analyses a single channel of audio and
estimates the locations of note onsets within the music.

It calculates an onset likelihood function for each spectral frame,
and picks peaks in a smoothed version of this function.  The plugin is
non-causal, returning all results at the end of processing.

It has three outputs: the note onset positions, the onset detection
function used in estimating onset positions, and a smoothed version of
the detection function that is used in the peak-picking phase.

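The smooth-then-pick-peaks idea described above can be illustrated
with a toy example.  This is a sketch under stated assumptions, not
the plugin's algorithm: the moving-average smoother, the strict
local-maximum rule, and all names and parameters here are stand-ins
chosen for brevity.

```python
def smooth(values, width=3):
    """Centred moving average; the window shrinks at both ends."""
    half = width // 2
    out = []
    for i in range(len(values)):
        window = values[max(0, i - half):i + half + 1]
        out.append(sum(window) / len(window))
    return out


def pick_peaks(values, threshold=0.0):
    """Indices of strict local maxima that exceed threshold."""
    return [i for i in range(1, len(values) - 1)
            if values[i] > threshold
            and values[i] > values[i - 1]
            and values[i] > values[i + 1]]


def onset_times(detection_function, step_seconds, width=3, threshold=0.0):
    """Smooth a per-frame detection function and return peak times.

    step_seconds is the (assumed) time between successive frames.
    """
    smoothed = smooth(detection_function, width)
    return [i * step_seconds for i in pick_peaks(smoothed, threshold)]
```

Because peaks can only be confirmed once the following frame is known,
and the smoothing looks ahead as well, a scheme like this naturally
reports its results after the audio has been seen, in line with the
non-causal behaviour noted above.
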

Tempo and Beat Tracker
----------------------

  Identifier:    qm-tempotracker
  Authors:       Matthew Davies and Christian Landone
  Category:      Time > Tempo

  References:    M. E. P. Davies and M. D. Plumbley.
                 Context-dependent beat tracking of musical audio.
                 In IEEE Transactions on Audio, Speech and Language
                 Processing. Vol. 15, No. 3, pp1009-1020, 2007.

                 M. E. P. Davies and M. D. Plumbley.
                 Beat Tracking With A Two State Model.
                 In Proceedings of the IEEE International Conference
                 on Acoustics, Speech and Signal Processing (ICASSP 2005),
                 Vol. 3, pp241-244, Philadelphia, USA, March 19-23, 2005.

The Tempo and Beat Tracker plugin analyses a single channel of audio
and estimates the locations of metrical beats and the resulting tempo.

It has three outputs: the beat positions, an ongoing estimate of tempo
where available, and the onset detection function used in estimating
beat positions.

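As a rough cross-check on the beat positions output, a host could
derive a tempo figure from the spacing of the beats.  The sketch below
is not how the plugin computes its own tempo output; the function name
and the choice of the median interval are assumptions made for
illustration.

```python
import statistics


def tempo_bpm(beat_times):
    """Estimate tempo in beats per minute from beat times in seconds.

    Uses the median interval between successive beats, which is less
    sensitive to an occasional missed or extra beat than the mean.
    Returns None if there are fewer than two beats or the median
    interval is not positive.
    """
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    if not intervals:
        return None
    median = statistics.median(intervals)
    if median <= 0:
        return None
    return 60.0 / median
```

For example, beats spaced half a second apart give 120 beats per
minute.
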

Key Detector
------------

  Identifier:    qm-keydetector
  Authors:       Katy Noland and Christian Landone
  Category:      Key and Tonality

  References:    K. Noland and M. Sandler.
                 Signal Processing Parameters for Tonality Estimation.
                 In Proceedings of Audio Engineering Society 122nd
                 Convention, Vienna, 2007.

The Key Detector plugin analyses a single channel of audio and
continuously estimates the key of the music.

It has four outputs: the tonic pitch of the key; a major or minor mode
flag; the key (combining the tonic and major/minor into a single
value); and a key strength plot which reports the degree to which the
chroma vector extracted from each input block correlates with the
stored key profiles for each major and minor key.  The key profiles
are drawn from analysis of Book I of the Well-Tempered Clavier by
J. S. Bach, recorded at A=440 equal temperament.

The outputs have the values:

  Tonic pitch: C = 1, C#/Db = 2, ..., B = 12

  Major/minor mode: major = 0, minor = 1

  Key: C major = 1, C#/Db major = 2, ..., B major = 12
       C minor = 13, C#/Db minor = 14, ..., B minor = 24

  Key Strength Plot: 25 separate bins per feature, separated into 1-12
    (major from C) and 14-25 (minor from C).  Bin 13 is unused, not
    for superstitious reasons but simply so as to delimit the major
    and minor areas if they are displayed on a single plot by the
    plugin host.  Higher bin values show increased correlation with
    the key profile for that key.

The outputs are also labelled with pitch or key as text.

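The numbering scheme above can be decoded mechanically.  The sketch
below is illustrative only: the function names are hypothetical, and
the spellings chosen for the pitches between C#/Db and B are an
assumption (the plugin's own text labels may differ).

```python
# Pitch classes in semitone order from C.  Only "C", "C#/Db" and "B"
# are spelled out in the README; the rest follow the same pattern.
PITCH_NAMES = ["C", "C#/Db", "D", "D#/Eb", "E", "F",
               "F#/Gb", "G", "G#/Ab", "A", "A#/Bb", "B"]


def key_name(key_value):
    """Convert a Key output value (1..24) to a name such as 'C major'."""
    if not 1 <= key_value <= 24:
        raise ValueError("key value must be in 1..24")
    index = key_value - 1
    mode = "major" if index < 12 else "minor"
    return f"{PITCH_NAMES[index % 12]} {mode}"


def strength_bin_name(bin_number):
    """Convert a key strength bin (1..25) to a key name.

    Bin 13 is the unused separator and maps to None; bins 14..25 are
    the minor keys, i.e. key values 13..24.
    """
    if not 1 <= bin_number <= 25:
        raise ValueError("bin number must be in 1..25")
    if bin_number == 13:
        return None
    if bin_number < 13:
        return key_name(bin_number)
    return key_name(bin_number - 1)
```

Feature values arrive from a host as floating-point numbers, so a
caller would typically round them to integers before decoding.
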

Tonal Change
------------

  Identifier:    qm-tonalchange
  Authors:       Chris Harte and Martin Gasser
  Category:      Key and Tonality

  References:    C. A. Harte, M. Gasser, and M. Sandler.
                 Detecting harmonic change in musical audio.
                 In Proceedings of the 1st ACM workshop on Audio and Music
                 Computing Multimedia, Santa Barbara, 2006.

                 C. A. Harte and M. Sandler.
                 Automatic chord identification using a quantised chromagram.
                 In Proceedings of the 118th Convention of the Audio
                 Engineering Society, Barcelona, Spain, May 28-31 2005.

The Tonal Change plugin analyses a single channel of audio, detecting
harmonic changes such as chord boundaries.

It has three outputs: a representation of the musical content in a
six-dimensional tonal space onto which the algorithm maps 12-bin
chroma vectors extracted from the audio; a function representing the
estimated likelihood of a tonal change occurring in each spectral
frame; and the resulting estimated positions of tonal changes.

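To give a feel for the mapping from a 12-bin chroma vector to a
six-dimensional tonal space, here is a sketch of a tonal-centroid
transform in the style of the first paper cited above.  The radii and
angles are as we recall them from that paper and should be treated as
an assumption; the plugin's implementation may differ in detail, and
the function names are hypothetical.

```python
import math

# (radius, angular step per semitone) for three circles: fifths,
# minor thirds and major thirds.  Assumed values, see lead-in.
CIRCLES = (
    (1.0, 7 * math.pi / 6),
    (1.0, 3 * math.pi / 2),
    (0.5, 2 * math.pi / 3),
)


def tonal_centroid(chroma):
    """Map a 12-bin chroma vector to a six-dimensional point.

    Each circle contributes a (sin, cos) pair, weighted by the chroma
    values and normalised by their L1 norm.  An all-zero chroma vector
    maps to the origin.
    """
    if len(chroma) != 12:
        raise ValueError("expected 12 chroma bins")
    total = sum(abs(c) for c in chroma)
    if total == 0:
        return [0.0] * 6
    centroid = []
    for radius, step in CIRCLES:
        centroid.append(
            sum(radius * math.sin(l * step) * c for l, c in enumerate(chroma)) / total)
        centroid.append(
            sum(radius * math.cos(l * step) * c for l, c in enumerate(chroma)) / total)
    return centroid


def tonal_distance(a, b):
    """Euclidean distance between two tonal centroids."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

A change-likelihood function could then be formed from the distance
between centroids of neighbouring frames, with large distances
suggesting a harmonic change.
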

Segmenter
---------

  Identifier:    qm-segmenter
  Authors:       Mark Levy
  Category:      Classification

  References:    M. Levy and M. Sandler.
                 Structural segmentation of musical audio by constrained
                 clustering.
                 IEEE Transactions on Audio, Speech, and Language Processing,
                 February 2008.

The Segmenter plugin divides a single channel of music up into
structurally consistent segments.  Its single output contains a
numeric value (the segment type) for each moment at which a new
segment starts.

For music with clearly tonally distinguishable sections such as verse,
chorus, etc., the segments with the same type may be expected to be
similar to one another in some structural sense (e.g. repetitions of
the chorus).

The type of feature used in segmentation can be selected using the
Feature Type parameter.  The default Hybrid (Constant-Q) is generally
effective for modern studio recordings, while the Chromatic option may
be preferable for live, acoustic, or older recordings, in which
repeated sections may be less consistent in sound.  Also available is
a timbral (MFCC) feature, which is more likely to result in
classification by instrumentation rather than musical content.

Note that this plugin does a substantial amount of processing after
receiving all of the input audio data, before it produces any results.

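Since the output gives only a start time and a type for each segment,
a host usually has to reconstruct the segment extents itself.  The
sketch below shows one way to do that; the function names and the
(time, type) tuple layout are assumptions made for illustration.

```python
def segments_from_starts(starts, end_time):
    """Turn [(start_time, segment_type), ...] into full segments.

    starts must be sorted by time.  Each segment runs until the next
    segment starts; the last one runs until end_time.  Returns a list
    of (start, end, segment_type) tuples.
    """
    segments = []
    for i, (start, segment_type) in enumerate(starts):
        end = starts[i + 1][0] if i + 1 < len(starts) else end_time
        segments.append((start, end, segment_type))
    return segments


def group_by_type(segments):
    """Collect the (start, end) extents of all segments of each type."""
    groups = {}
    for start, end, segment_type in segments:
        groups.setdefault(segment_type, []).append((start, end))
    return groups
```

Grouping by type makes repeated sections easy to find: all extents
listed under one type are the sections the plugin considered alike.
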

Similarity
----------

  Identifier:    qm-similarity
  Authors:       Mark Levy, Kurt Jacobson and Chris Cannam
  Category:      Classification

  References:    M. Levy and M. Sandler.
                 Lightweight measures for timbral similarity of musical audio.
                 In Proceedings of the 1st ACM workshop on Audio and Music
                 Computing Multimedia, Santa Barbara, 2006.

                 K. Jacobson.
                 A Multifaceted Approach to Music Similarity.
                 In Proceedings of the Seventh International Conference on
                 Music Information Retrieval (ISMIR), 2006.

The Similarity plugin treats each channel of its audio input as a
separate "track", and estimates how similar the tracks are to one
another using a selectable similarity measure.

The plugin also returns the intermediate data used as a basis of the
similarity measure; it can therefore be used on a single channel of
input (with the resulting intermediate data then being applied in some
other similarity or clustering algorithm, for example) if desired, as
well as with multiple inputs.

The underlying audio features used for the similarity measure can be
selected using the Feature Type parameter.  The available features are
Timbre (in which the distance between tracks is a symmetrised
Kullback-Leibler divergence between Gaussian-modelled MFCC means and
variances across each track); Chroma (KL divergence of mean chroma
histogram); Rhythm (cosine distance between "beat spectrum" measures
derived from a short sampled section of the track); and combined
"Timbre and Rhythm" and "Chroma and Rhythm".

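Two of the distance measures named above have standard textbook forms,
sketched here for reference.  The sketch assumes one independent
Gaussian per feature bin (diagonal covariance) and defines the
symmetrised divergence as the sum of the two directed divergences; the
plugin's exact formulation, scaling and handling of edge cases may
differ, and the function names are hypothetical.

```python
import math


def symmetrised_kl(mean_a, var_a, mean_b, var_b):
    """KL(a||b) + KL(b||a) for two diagonal-covariance Gaussians.

    All four arguments are equal-length sequences, one entry per
    feature bin; variances must be strictly positive.
    """
    total = 0.0
    for ma, va, mb, vb in zip(mean_a, var_a, mean_b, var_b):
        diff_sq = (ma - mb) ** 2
        total += va / vb + vb / va + diff_sq * (1.0 / va + 1.0 / vb) - 2.0
    return 0.5 * total


def cosine_distance(a, b):
    """One minus the cosine of the angle between vectors a and b.

    Returns 1.0 if either vector is all zeros (an assumption made
    here to avoid dividing by zero).
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        return 1.0
    return 1.0 - dot / norm
```

Both measures are zero for identical inputs and grow as the tracks
become less alike, so smaller values mean greater similarity.
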
The plugin has six outputs: a matrix of the distances between input
channels; a vector containing the distances between the first input
channel and each of the input channels; a pair of vectors containing
the indices of the input channels in the order of their similarity to
the first input channel, and the distances between the first input
channel and each of those channels; the means of the underlying
feature bins (MFCCs or chroma); the variances of the underlying
feature bins; and the beat spectra used for the rhythmic feature.

Because Vamp does not have the capability to return features in matrix
form explicitly, the matrix output is returned as a series of vector
features timestamped at one-second intervals.  Likewise, the
underlying feature outputs contain one vector feature per input
channel, timestamped at one-second intervals (so the feature for the
first channel is at time 0, and so on).  Examining the features that
the plugin actually returns, when run on some test data, may make this
arrangement clearer.

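A host that wants the distance matrix back in matrix form can
reassemble it from the timestamped vectors.  The sketch below assumes
each feature is available as a (timestamp_in_seconds, values) pair and
that the vector at time n is row n of the matrix; both the data layout
and the function name are assumptions for illustration.

```python
def matrix_from_features(features):
    """Rebuild a matrix from vector features stamped 0 s, 1 s, 2 s, ...

    features is an iterable of (timestamp_seconds, values) pairs in
    any order.  The timestamp, rounded to the nearest second, selects
    the row.  Raises KeyError if a row is missing.
    """
    rows = {}
    for timestamp, values in features:
        rows[int(round(timestamp))] = list(values)
    return [rows[i] for i in range(len(rows))]
```

The same approach applies to the per-channel means, variances and
beat spectra outputs, where row n then holds the feature vector for
input channel n.
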
Note that the underlying feature values are only returned if the
relevant feature type is selected.  That is, the means and variances
outputs are valid provided the pure rhythm feature is not selected;
the beat spectra output is valid provided rhythm is included in the
selected feature type.


Constant-Q Spectrogram
----------------------

  Identifier:    qm-constantq
  Authors:       Christian Landone
  Category:      Visualisation

  References:    J. Brown.
                 Calculation of a constant Q spectral transform.
                 Journal of the Acoustical Society of America, 89(1):
                 425-434, 1991.

The Constant-Q Spectrogram plugin calculates a spectrogram based on a
short-time windowed constant Q spectral transform.  This is a
spectrogram in which the ratio of centre frequency to resolution is
constant for each frequency bin.  The frequency bins correspond to the
frequencies of "musical notes" rather than being linearly spaced in
frequency as they are for the conventional DFT spectrogram.

The pitch range and the number of frequency bins per octave may be
adjusted using the plugin's parameters.  Note that the plugin's
preferred step and block sizes depend on these parameters, and the
plugin will not accept any other block size.

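The constant-ratio property described above follows directly from
spacing the bins geometrically.  The sketch below computes bin centre
frequencies and the resulting Q, taking "resolution" to mean the
spacing between adjacent bin centres.  The function and its parameters
(frequencies in Hz) are hypothetical and are not the plugin's own
parameter names.

```python
def constant_q_bins(min_freq, max_freq, bins_per_octave):
    """Return (q, centre_frequencies) for a geometric bin layout.

    Bin k is centred at min_freq * 2 ** (k / bins_per_octave).  The
    gap to the next bin is f_k * (2 ** (1 / bins_per_octave) - 1), so
    the ratio of centre frequency to gap is the same for every bin.
    """
    ratio = 2.0 ** (1.0 / bins_per_octave)
    q = 1.0 / (ratio - 1.0)
    frequencies = []
    k = 0
    while True:
        f = min_freq * 2.0 ** (k / bins_per_octave)
        if f > max_freq:
            break
        frequencies.append(f)
        k += 1
    return q, frequencies
```

With 12 bins per octave each bin corresponds to one semitone, and Q
comes out at roughly 16.8.
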

Chromagram
----------

  Identifier:    qm-chromagram
  Authors:       Christian Landone
  Category:      Visualisation

The Chromagram plugin calculates a constant Q spectral transform (as
above) and then wraps the frequency bin values into a single octave,
with each bin containing the sum of the magnitudes from the
corresponding bin in all octaves.  The number of values in each
feature vector returned by the plugin is therefore the same as the
number of bins per octave configured for the underlying constant Q
transform.

The pitch range and the number of frequency bins per octave for the
transform may be adjusted using the plugin's parameters.  Note that
the plugin's preferred step and block sizes depend on these
parameters, and the plugin will not accept any other block size.

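The octave-wrapping step described above amounts to summing every
bin that shares a position within the octave.  A minimal sketch,
assuming the first constant-Q bin sits at the start of an octave and
using a hypothetical function name:

```python
def wrap_to_chroma(cq_magnitudes, bins_per_octave):
    """Fold constant-Q magnitudes into a single octave.

    Output bin j is the sum of input bins j, j + bins_per_octave,
    j + 2 * bins_per_octave, and so on.
    """
    chroma = [0.0] * bins_per_octave
    for k, magnitude in enumerate(cq_magnitudes):
        chroma[k % bins_per_octave] += magnitude
    return chroma
```

The result always has bins_per_octave values, whatever the pitch range
of the underlying transform.
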

Mel-Frequency Cepstral Coefficients
-----------------------------------

  Identifier:    qm-mfcc
  Authors:       Nicolas Chetry and Chris Cannam
  Category:      Low Level Features

  References:    B. Logan.
                 Mel-Frequency Cepstral Coefficients for Music Modeling.
                 In Proceedings of the First International Symposium on Music
                 Information Retrieval (ISMIR), 2000.

The Mel-Frequency Cepstral Coefficients plugin calculates MFCCs from a
single channel of audio, returning one MFCC vector from each process
call.  It also returns the overall means of the coefficient values
across the length of the audio input, as a separate output at the end
of processing.

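The "one vector per call, means at the end" pattern described above
can be mimicked with a small accumulator.  This is a sketch of the
bookkeeping only, not of the MFCC calculation itself, and the class
and method names are hypothetical.

```python
class RunningMeans:
    """Accumulate per-coefficient sums so means can be read at the end."""

    def __init__(self):
        self._sums = None
        self._count = 0

    def add(self, vector):
        """Record one coefficient vector (e.g. from one process call)."""
        if self._sums is None:
            self._sums = [0.0] * len(vector)
        if len(vector) != len(self._sums):
            raise ValueError("vector length changed between calls")
        for i, value in enumerate(vector):
            self._sums[i] += value
        self._count += 1

    def means(self):
        """Per-coefficient means over all vectors seen so far."""
        if self._count == 0:
            return []
        return [s / self._count for s in self._sums]
```

Keeping running sums rather than every vector means the memory needed
does not grow with the length of the audio.
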