qm-vamp-plugins: README.txt.osx annotate

annotate README.txt.osx @ 63:b084e87b83e4

* Add README files for the various platform packages
* Fix typo in cat file
* Return simpler key names from key detector
* Chromagram and constant Q default to unnormalized
* Permit up to 48 bpo in constant Q

author	Chris Cannam <c.cannam@qmul.ac.uk>
date	Thu, 07 Feb 2008 10:03:04 +0000
parents
children	e7c785094e7b

rev	line source
c@63	1
c@63	2 QM Vamp Plugins
c@63	3 ===============
c@63	4
c@63	5 Vamp audio feature extraction plugins from Queen Mary, University of London.
c@63	6 Version 1.4.
c@63	7
c@63	8 For more information about Vamp plugins, see http://www.vamp-plugins.org/
c@63	9 and http://www.sonicvisualiser.org/ .
c@63	10
c@63	11
c@63	12 To Install
c@63	13 ==========
c@63	14
c@63	15 This package contains plugins for the Apple Mac OS X operating system,
c@63	16 compatible with both PPC and Intel hardware.
c@63	17
c@63	18 To install them, copy the files
c@63	19
c@63	20 qm-vamp-plugins.dylib and
c@63	21 qm-vamp-plugins.cat
c@63	22
c@63	23 to either
c@63	24
c@63	25 /Library/Audio/Plug-Ins/Vamp/ (for plugins available to all users) or
c@63	26 $HOME/Library/Audio/Plug-Ins/Vamp/ (for plugins available to you only).
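
The copy step above can also be scripted. The following is a minimal
sketch in Python, not part of the package: the function name
install_plugins and its parameters are hypothetical, and only the two
file names and the two destination directories come from the text
above.

```python
import shutil
from pathlib import Path

# The two files named in the install instructions.
PLUGIN_FILES = ("qm-vamp-plugins.dylib", "qm-vamp-plugins.cat")


def install_plugins(package_dir, system_wide=False, dest_dir=None):
    """Copy the plugin files into a Vamp plugin directory and return it.

    dest_dir overrides the default location (useful for a dry run).
    """
    if dest_dir is None:
        if system_wide:
            # Available to all users; usually needs administrator rights.
            dest_dir = Path("/Library/Audio/Plug-Ins/Vamp")
        else:
            # Available to the current user only.
            dest_dir = Path.home() / "Library" / "Audio" / "Plug-Ins" / "Vamp"
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    for name in PLUGIN_FILES:
        shutil.copy2(Path(package_dir) / name, dest_dir / name)
    return dest_dir
```

For example, install_plugins(".") run from the unpacked package
directory would copy both files into the per-user location.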
c@63	27
c@63	28
c@63	29 License
c@63	30 =======
c@63	31
c@63	32 These plugins are provided in binary form only. You may install and
c@63	33 use the plugin binaries without fee for any purpose commercial or
c@63	34 non-commercial. You may redistribute the plugin binaries provided you
c@63	35 do so without fee and you retain this README file with your
c@63	36 distribution. You may not bundle these plugins with a commercial
c@63	37 product or distribute them on commercial terms. If you wish to
c@63	38 arrange commercial licensing terms, please contact the Centre for
c@63	39 Digital Music at Queen Mary, University of London.
c@63	40
c@63	41 Copyright (c) 2006-2008 Queen Mary, University of London. All rights
c@63	42 reserved except as described above.
c@63	43
c@63	44
c@63	45 New In This Release
c@63	46 ===================
c@63	47
c@63	48 This release contains a new plugin to estimate timbral and rhythmic
c@63	49 similarity between multiple audio tracks, a plugin for structural
c@63	50 segmentation of music audio, and a Mel-frequency cepstral coefficients
c@63	51 calculation plugin.
c@63	52
c@63	53 This release also includes significant updates to the existing key
c@63	54 detector, tempo tracker, and chromagram plugins.
c@63	55
c@63	56
c@63	57 Plugins Included
c@63	58 ================
c@63	59
c@63	60 This plugin set includes the following plugins:
c@63	61
c@63	62 * Note onset detector
c@63	63
c@63	64 * Beat tracker and tempo estimator
c@63	65
c@63	66 * Key estimator and tonal change detector
c@63	67
c@63	68 * Segmenter, to divide a track into a consistent sequence of segments
c@63	69
c@63	70 * Timbral and rhythmic similarity between audio tracks
c@63	71
c@63	72 * Chromagram, constant-Q spectrogram, and MFCC calculation plugins
c@63	73
c@63	74 More details about the plugins follow.
c@63	75
c@63	76
c@63	77 Note Onset Detector
c@63	78 -------------------
c@63	79
c@63	80 Identifier: qm-onsetdetector
c@63	81 Authors: Chris Duxbury, Juan Pablo Bello and Christian Landone
c@63	82 Category: Time > Onsets
c@63	83
c@63	84 References: C. Duxbury, J. P. Bello, M. Davies and M. Sandler.
c@63	85 Complex domain Onset Detection for Musical Signals.
c@63	86 In Proceedings of the 6th Conference on Digital Audio
c@63	87 Effects (DAFx-03). London, UK. September 2003.
c@63	88
c@63	89 D. Stowell and M. D. Plumbley.
c@63	90 Adaptive whitening for improved real-time audio onset
c@63	91 detection.
c@63	92 In Proceedings of the International Computer Music
c@63	93 Conference (ICMC'07), August 2007.
c@63	94
c@63	95 D. Barry, D. Fitzgerald, E. Coyle and B. Lawlor.
c@63	96 Drum Source Separation using Percussive Feature
c@63	97 Detection and Spectral Modulation.
c@63	98 ISSC 2005
c@63	99
c@63	100 The Note Onset Detector plugin analyses a single channel of audio and
c@63	101 estimates the locations of note onsets within the music.
c@63	102
c@63	103 It calculates an onset likelihood function for each spectral frame,
c@63	104 and picks peaks in a smoothed version of this function. The plugin is
c@63	105 non-causal, returning all results at the end of processing.
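
As a rough illustration of the "smooth, then pick peaks" idea only:
the sketch below uses a plain moving average and a simple local
maximum test. These are assumptions chosen for brevity; the plugin's
actual smoothing and peak-picking methods are not described here and
are likely to differ.

```python
def smooth(values, half_width=2):
    """Centred moving average; the window shrinks at the edges."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - half_width)
        hi = min(len(values), i + half_width + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def pick_peaks(values, threshold=0.0):
    """Return indices of local maxima that exceed the threshold."""
    return [
        i
        for i in range(1, len(values) - 1)
        if values[i] > threshold
        and values[i] > values[i - 1]
        and values[i] >= values[i + 1]
    ]
```

Because both steps look at frames after the current one, a detector
built this way cannot report an onset at the moment it occurs, which
is one reason such a plugin would return its results late.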
c@63	106
c@63	107 It has three outputs: the note onset positions, the onset detection
c@63	108 function used in estimating onset positions, and a smoothed version of
c@63	109 the detection function that is used in the peak-picking phase.
c@63	110
c@63	111
c@63	112 Tempo and Beat Tracker
c@63	113 ----------------------
c@63	114
c@63	115 Identifier: qm-tempotracker
c@63	116 Authors: Matthew Davies and Christian Landone
c@63	117 Category: Time > Tempo
c@63	118
c@63	119 References: M. E. P. Davies and M. D. Plumbley.
c@63	120 Context-dependent beat tracking of musical audio.
c@63	121 In IEEE Transactions on Audio, Speech and Language
c@63	122 Processing. Vol. 15, No. 3, pp. 1009-1020, 2007.
c@63	123
c@63	124 M. E. P. Davies and M. D. Plumbley.
c@63	125 Beat Tracking With A Two State Model.
c@63	126 In Proceedings of the IEEE International Conference
c@63	127 on Acoustics, Speech and Signal Processing (ICASSP 2005),
c@63	128 Vol. 3, pp. 241-244, Philadelphia, USA, March 19-23, 2005.
c@63	129
c@63	130 The Tempo and Beat Tracker plugin analyses a single channel of audio
c@63	131 and estimates the locations of metrical beats and the resulting tempo.
c@63	132
c@63	133 It has three outputs: the beat positions, an ongoing estimate of tempo
c@63	134 where available, and the onset detection function used in estimating
c@63	135 beat positions.
c@63	136
c@63	137
c@63	138 Key Detector
c@63	139 ------------
c@63	140
c@63	141 Identifier: qm-keydetector
c@63	142 Authors: Katy Noland and Christian Landone
c@63	143 Category: Key and Tonality
c@63	144
c@63	145 References: K. Noland and M. Sandler.
c@63	146 Signal Processing Parameters for Tonality Estimation.
c@63	147 In Proceedings of Audio Engineering Society 122nd
c@63	148 Convention, Vienna, 2007.
c@63	149
c@63	150 The Key Detector plugin analyses a single channel of audio and
c@63	151 continuously estimates the key of the music.
c@63	152
c@63	153 It has four outputs: the tonic pitch of the key; a major or minor mode
c@63	154 flag; the key (combining the tonic and major/minor into a single
c@63	155 value); and a key strength plot which reports the degree to which the
c@63	156 chroma vector extracted from each input block correlates to the stored
c@63	157 key profiles for each major and minor key. The key profiles are drawn
c@63	158 from analysis of Book I of the Well-Tempered Clavier by J. S. Bach,
c@63	159 recorded at A=440 equal temperament.
c@63	160
c@63	161 The outputs have the values:
c@63	162
c@63	163 Tonic pitch: C = 1, C#/Db = 2, ..., B = 12
c@63	164
c@63	165 Major/minor mode: major = 0, minor = 1
c@63	166
c@63	167 Key: C major = 1, C#/Db major = 2, ..., B major = 12
c@63	168 C minor = 13, C#/Db minor = 14, ..., B minor = 24
c@63	169
c@63	170 Key Strength Plot: 25 separate bins per feature, separated into 1-12
c@63	171 (major from C) and 14-25 (minor from C). Bin 13 is unused, not
c@63	172 for superstitious reasons but simply so as to delimit the major
c@63	173 and minor areas if they are displayed on a single plot by the
c@63	174 plugin host. Higher bin values show increased correlation with
c@63	175 the key profile for that key.
c@63	176
c@63	177 The outputs are also labelled with pitch or key as text.
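
The numbering above can be decoded as in the following sketch. The
function names are hypothetical, and the spellings of the pitches
between C#/Db and B are an assumption (the text labels the plugin
attaches may be spelled differently). Bin numbers are taken to be
1-based, as written above; a host that indexes feature values from 0
would need to add 1 first.

```python
PITCHES = ["C", "C#/Db", "D", "D#/Eb", "E", "F",
           "F#/Gb", "G", "G#/Ab", "A", "A#/Bb", "B"]


def key_name(key_value):
    """Decode the Key output: 1-12 are major keys, 13-24 minor keys."""
    if not 1 <= key_value <= 24:
        raise ValueError("key value must be in 1..24")
    tonic = PITCHES[(key_value - 1) % 12]
    mode = "major" if key_value <= 12 else "minor"
    return f"{tonic} {mode}"


def strength_bin_name(bin_number):
    """Decode a Key Strength Plot bin (1-25); bin 13 is the unused gap."""
    if not 1 <= bin_number <= 25:
        raise ValueError("bin number must be in 1..25")
    if bin_number == 13:
        return None
    # Bins 14-25 hold the minor keys, i.e. key values 13-24.
    return key_name(bin_number if bin_number < 13 else bin_number - 1)
```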
c@63	178
c@63	179
c@63	180 Tonal Change
c@63	181 ------------
c@63	182
c@63	183 Identifier: qm-tonalchange
c@63	184 Authors: Chris Harte and Martin Gasser
c@63	185 Category: Key and Tonality
c@63	186
c@63	187 References: C. A. Harte, M. Gasser, and M. Sandler.
c@63	188 Detecting harmonic change in musical audio.
c@63	189 In Proceedings of the 1st ACM workshop on Audio and Music
c@63	190 Computing Multimedia, Santa Barbara, 2006.
c@63	191
c@63	192 C. A. Harte and M. Sandler.
c@63	193 Automatic chord identification using a quantised chromagram.
c@63	194 In Proceedings of the 118th Convention of the Audio
c@63	195 Engineering Society, Barcelona, Spain, May 28-31 2005.
c@63	196
c@63	197 The Tonal Change plugin analyses a single channel of audio, detecting
c@63	198 harmonic changes such as chord boundaries.
c@63	199
c@63	200 It has three outputs: a representation of the musical content in a
c@63	201 six-dimensional tonal space onto which the algorithm maps 12-bin
c@63	202 chroma vectors extracted from the audio; a function representing the
c@63	203 estimated likelihood of a tonal change occurring in each spectral
c@63	204 frame; and the resulting estimated positions of tonal changes.
c@63	205
c@63	206
c@63	207 Segmenter
c@63	208 ---------
c@63	209
c@63	210 Identifier: qm-segmenter
c@63	211 Authors: Mark Levy
c@63	212 Category: Classification
c@63	213
c@63	214 References: M. Levy and M. Sandler.
c@63	215 Structural segmentation of musical audio by constrained
c@63	216 clustering.
c@63	217 IEEE Transactions on Audio, Speech, and Language Processing,
c@63	218 February 2008.
c@63	219
c@63	220 The Segmenter plugin divides a single channel of music up into
c@63	221 structurally consistent segments. Its single output contains a
c@63	222 numeric value (the segment type) for each moment at which a new
c@63	223 segment starts.
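
A host or script that wants explicit segments rather than start
points can pair each start with the next one. This is a minimal
sketch: the function name and the (time, type) tuple layout are
assumptions, and the end of the final segment has to be supplied by
the caller (for example, the duration of the track).

```python
def segments_from_starts(starts, end_time):
    """Turn (start_time, segment_type) pairs, sorted by time, into
    (start_time, end_time, segment_type) triples."""
    segments = []
    for i, (start, seg_type) in enumerate(starts):
        # Each segment runs until the next start, or to the end of audio.
        end = starts[i + 1][0] if i + 1 < len(starts) else end_time
        segments.append((start, end, seg_type))
    return segments
```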
c@63	224
c@63	225 For music with clearly tonally distinguishable sections such as verse,
c@63	226 chorus, etc., the segments with the same type may be expected to be
c@63	227 similar to one another in some structural sense (e.g. repetitions of
c@63	228 the chorus).
c@63	229
c@63	230 The type of feature used in segmentation can be selected using the
c@63	231 Feature Type parameter. The default Hybrid (Constant-Q) is generally
c@63	232 effective for modern studio recordings, while the Chromatic option may
c@63	233 be preferable for live, acoustic, or older recordings, in which
c@63	234 repeated sections may be less consistent in sound. Also available is
c@63	235 a timbral (MFCC) feature, which is more likely to result in
c@63	236 classification by instrumentation rather than musical content.
c@63	237
c@63	238 Note that this plugin does a substantial amount of processing after
c@63	239 receiving all of the input audio data, before it produces any results.
c@63	240
c@63	241
c@63	242 Similarity
c@63	243 ----------
c@63	244
c@63	245 Identifier: qm-similarity
c@63	246 Authors: Mark Levy, Kurt Jacobson and Chris Cannam
c@63	247 Category: Classification
c@63	248
c@63	249 References: M. Levy and M. Sandler.
c@63	250 Lightweight measures for timbral similarity of musical audio.
c@63	251 In Proceedings of the 1st ACM workshop on Audio and Music
c@63	252 Computing Multimedia, Santa Barbara, 2006.
c@63	253
c@63	254 K. Jacobson.
c@63	255 A Multifaceted Approach to Music Similarity.
c@63	256 In Proceedings of the Seventh International Conference on
c@63	257 Music Information Retrieval (ISMIR), 2006.
c@63	258
c@63	259 The Similarity plugin treats each channel of its audio input as a
c@63	260 separate "track", and estimates how similar the tracks are to one
c@63	261 another using a selectable similarity measure.
c@63	262
c@63	263 The plugin also returns the intermediate data used as a basis of the
c@63	264 similarity measure; it can therefore be used on a single channel of
c@63	265 input (with the resulting intermediate data then being applied in some
c@63	266 other similarity or clustering algorithm, for example) if desired, as
c@63	267 well as with multiple inputs.
c@63	268
c@63	269 The underlying audio features used for the similarity measure can be
c@63	270 selected using the Feature Type parameter. The available features are
c@63	271 Timbre (in which the distance between tracks is a symmetrised
c@63	272 Kullback-Leibler divergence between Gaussian-modelled MFCC means and
c@63	273 variances across each track); Chroma (KL divergence of mean chroma
c@63	274 histogram); Rhythm (cosine distance between "beat spectrum" measures
c@63	275 derived from a short sampled section of the track); and combined
c@63	276 "Timbre and Rhythm" and "Chroma and Rhythm".
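
The two distance measures named above can be sketched as follows.
This is an illustration under stated assumptions, not the plugin's
code: each track is modelled as a Gaussian with independent
dimensions (one mean and one variance per feature bin), the
symmetrised divergence is taken as KL(a||b) + KL(b||a), and any
scaling or combination the plugin applies is not reproduced.

```python
import math


def symmetrised_kl(mean_a, var_a, mean_b, var_b):
    """KL(a||b) + KL(b||a) for Gaussians with independent dimensions.

    All variances must be strictly positive.
    """
    total = 0.0
    for ma, va, mb, vb in zip(mean_a, var_a, mean_b, var_b):
        diff_sq = (ma - mb) ** 2
        # The log terms of the two divergences cancel in the sum.
        total += va / vb + vb / va + diff_sq * (1.0 / va + 1.0 / vb) - 2.0
    return 0.5 * total


def cosine_distance(a, b):
    """1 - cosine similarity; both vectors must be non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)
```

Both functions return 0 for identical inputs and grow as the inputs
diverge, so smaller values mean more similar tracks.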
c@63	277
c@63	278 The plugin has six outputs: a matrix of the distances between input
c@63	279 channels; a vector containing the distances between the first input
c@63	280 channel and each of the input channels; a pair of vectors containing
c@63	281 the indices of the input channels in the order of their similarity to
c@63	282 the first input channel, and the distances between the first input
c@63	283 channel and each of those channels; the means of the underlying
c@63	284 feature bins (MFCCs or chroma); the variances of the underlying
c@63	285 feature bins; and the beat spectra used for the rhythmic feature.
c@63	286
c@63	287 Because Vamp does not have the capability to return features in matrix
c@63	288 form explicitly, the matrix output is returned as a series of vector
c@63	289 features timestamped at one-second intervals. Likewise, the
c@63	290 underlying feature outputs contain one vector feature per input
c@63	291 channel, timestamped at one-second intervals (so the feature for the
c@63	292 first channel is at time 0, and so on). Examining the features that
c@63	293 the plugin actually returns, when run on some test data, may make this
c@63	294 arrangement clearer.
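
A host-side script could put the matrix back together as in the
sketch below. It assumes the features have already been collected as
(timestamp in seconds, vector) pairs, and that row i of the matrix is
the feature timestamped at i seconds, by analogy with the per-channel
outputs described above. The function name is hypothetical.

```python
def matrix_from_features(features):
    """Rebuild a matrix from (timestamp_seconds, vector) pairs.

    Row order follows the timestamps, so the input may be unsorted.
    """
    rows = {}
    for timestamp, vector in features:
        # Timestamps are whole seconds; round to absorb float noise.
        rows[int(round(timestamp))] = list(vector)
    return [rows[i] for i in sorted(rows)]
```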
c@63	295
c@63	296 Note that the underlying feature values are only returned if the
c@63	297 relevant feature type is selected. That is, the means and variances
c@63	298 outputs are valid provided the pure rhythm feature is not selected;
c@63	299 the beat spectra output is valid provided rhythm is included in the
c@63	300 selected feature type.
c@63	301
c@63	302
c@63	303 Constant-Q Spectrogram
c@63	304 ----------------------
c@63	305
c@63	306 Identifier: qm-constantq
c@63	307 Authors: Christian Landone
c@63	308 Category: Visualisation
c@63	309
c@63	310 References: J. Brown.
c@63	311 Calculation of a constant Q spectral transform.
c@63	312 Journal of the Acoustical Society of America, 89(1):
c@63	313 425-434, 1991.
c@63	314
c@63	315 The Constant-Q Spectrogram plugin calculates a spectrogram based on a
c@63	316 short-time windowed constant Q spectral transform. This is a
c@63	317 spectrogram in which the ratio of centre frequency to resolution is
c@63	318 constant for each frequency bin. The frequency bins correspond to the
c@63	319 frequencies of "musical notes" rather than being linearly spaced in
c@63	320 frequency as they are for the conventional DFT spectrogram.
c@63	321
c@63	322 The pitch range and the number of frequency bins per octave may be
c@63	323 adjusted using the plugin's parameters. Note that the plugin's
c@63	324 preferred step and block sizes depend on these parameters, and the
c@63	325 plugin will not accept any other block size.
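
To make the "constant ratio" concrete, the sketch below computes
geometrically spaced centre frequencies and the corresponding Q using
the usual definitions for a constant Q transform. The function name
and its frequency-based parameters are assumptions for illustration;
they are not the plugin's parameter names.

```python
def constant_q_bins(min_freq, max_freq, bins_per_octave):
    """Return (Q, centre frequencies) for a constant Q analysis.

    Bin k is centred at min_freq * 2**(k / bins_per_octave), and
    Q = centre frequency / bandwidth is the same for every bin.
    """
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    freqs = []
    k = 0
    while True:
        freq = min_freq * 2.0 ** (k / bins_per_octave)
        if freq > max_freq:
            break
        freqs.append(freq)
        k += 1
    return q, freqs
```

With 12 bins per octave each bin spans one semitone and Q is roughly
17; raising the number of bins per octave narrows each bin and raises
Q, which in turn calls for longer analysis windows at low frequencies.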
c@63	326
c@63	327
c@63	328 Chromagram
c@63	329 ----------
c@63	330
c@63	331 Identifier: qm-chromagram
c@63	332 Authors: Christian Landone
c@63	333 Category: Visualisation
c@63	334
c@63	335 The Chromagram plugin calculates a constant Q spectral transform (as
c@63	336 above) and then wraps the frequency bin values into a single octave,
c@63	337 with each bin containing the sum of the magnitudes from the
c@63	338 corresponding bin in all octaves. The number of values in each
c@63	339 feature vector returned by the plugin is therefore the same as the
c@63	340 number of bins per octave configured for the underlying constant Q
c@63	341 transform.
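
The wrapping step can be sketched as below. The function name is
hypothetical, and the sketch assumes that the constant Q bins are
ordered from low to high frequency with bin 0 falling in the first
chroma bin. No normalisation is applied.

```python
def fold_to_chroma(cq_magnitudes, bins_per_octave):
    """Sum constant Q magnitudes octave by octave into one octave."""
    chroma = [0.0] * bins_per_octave
    for k, magnitude in enumerate(cq_magnitudes):
        # Bins one octave apart share the same position in the octave.
        chroma[k % bins_per_octave] += magnitude
    return chroma
```

The result always has bins_per_octave values, however many octaves
the underlying transform covers.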
c@63	342
c@63	343 The pitch range and the number of frequency bins per octave for the
c@63	344 transform may be adjusted using the plugin's parameters. Note that
c@63	345 the plugin's preferred step and block sizes depend on these
c@63	346 parameters, and the plugin will not accept any other block size.
c@63	347
c@63	348
c@63	349 Mel-Frequency Cepstral Coefficients
c@63	350 -----------------------------------
c@63	351
c@63	352 Identifier: qm-mfcc
c@63	353 Authors: Nicolas Chetry and Chris Cannam
c@63	354 Category: Low Level Features
c@63	355
c@63	356 References: B. Logan.
c@63	357 Mel-Frequency Cepstral Coefficients for Music Modeling.
c@63	358 In Proceedings of the First International Symposium on Music
c@63	359 Information Retrieval (ISMIR), 2000.
c@63	360
c@63	361 The Mel-Frequency Cepstral Coefficients plugin calculates MFCCs from a
c@63	362 single channel of audio, returning one MFCC vector from each process
c@63	363 call. It also returns the overall means of the coefficient values
c@63	364 across the length of the audio input, as a separate output at the end
c@63	365 of processing.
c@63	366
