Mercurial > hg > qm-vamp-plugins
changeset 128:9949881120a4
* Remove inline docs in README; replace with pointer to online docs (will
be more accurate)
author    Chris Cannam <c.cannam@qmul.ac.uk>
date      Tue, 30 Jun 2009 14:50:08 +0000
parents   fb4688d2cca5
children  1a24b134cd79
files     README.txt
diffstat  1 files changed, 3 insertions(+), 387 deletions(-)
--- a/README.txt	Tue Jun 30 10:15:01 2009 +0000
+++ b/README.txt	Tue Jun 30 14:50:08 2009 +0000
@@ -43,7 +43,9 @@
 * Chromagram, constant-Q spectrogram, and MFCC calculation plugins
 
-More details about the plugins follow.
+For full details about the plugins, with references, please see
+
+  http://vamp-plugins.org/plugin-doc/qm-vamp-plugins.html
 
 License
@@ -82,389 +84,3 @@
     or /usr/lib/vamp/
-
-The Plugins
-===========
-
-Note Onset Detector
--------------------
-
-  Identifier: qm-onsetdetector
-  Authors: Chris Duxbury, Juan Pablo Bello and Christian Landone
-  Category: Time > Onsets
-
-  References: C. Duxbury, J. P. Bello, M. Davies and M. Sandler.
-      Complex domain Onset Detection for Musical Signals.
-      In Proceedings of the 6th Conference on Digital Audio
-      Effects (DAFx-03). London, UK. September 2003.
-
-      D. Stowell and M. D. Plumbley.
-      Adaptive whitening for improved real-time audio onset
-      detection.
-      In Proceedings of the International Computer Music
-      Conference (ICMC'07), August 2007.
-
-      D. Barry, D. Fitzgerald, E. Coyle and B. Lawlor.
-      Drum Source Separation using Percussive Feature
-      Detection and Spectral Modulation.
-      ISSC 2005
-
-The Note Onset Detector plugin analyses a single channel of audio and
-estimates the locations of note onsets within the music.
-
-It calculates an onset likelihood function for each spectral frame,
-and picks peaks in a smoothed version of this function.  The plugin is
-non-causal, returning all results at the end of processing.
-
-It has three outputs: the note onset positions, the onset detection
-function used in estimating onset positions, and a smoothed version of
-the detection function that is used in the peak-picking phase.
-
-
-Tempo and Beat Tracker
-----------------------
-
-  Identifier: qm-tempotracker
-  Authors: Matthew Davies and Christian Landone
-  Category: Time > Tempo
-
-  References: M. E. P. Davies and M. D. Plumbley.
-      Context-dependent beat tracking of musical audio.
-      In IEEE Transactions on Audio, Speech and Language
-      Processing. Vol. 15, No. 3, pp1009-1020, 2007.
-
-      M. E. P. Davies and M. D. Plumbley.
-      Beat Tracking With A Two State Model.
-      In Proceedings of the IEEE International Conference
-      on Acoustics, Speech and Signal Processing (ICASSP 2005),
-      Vol. 3, pp241-244, Philadelphia, USA, March 19-23, 2005.
-
-The Tempo and Beat Tracker plugin analyses a single channel of audio
-and estimates the locations of metrical beats and the resulting tempo.
-
-It has three outputs: the beat positions, an ongoing estimate of tempo
-where available, and the onset detection function used in estimating
-beat positions.
-
-
-Key Detector
-------------
-
-  Identifier: qm-keydetector
-  Authors: Katy Noland and Christian Landone
-  Category: Key and Tonality
-
-  References: K. Noland and M. Sandler.
-      Signal Processing Parameters for Tonality Estimation.
-      In Proceedings of Audio Engineering Society 122nd
-      Convention, Vienna, 2007.
-
-The Key Detector plugin analyses a single channel of audio and
-continuously estimates the key of the music.
-
-It has four outputs: the tonic pitch of the key; a major or minor mode
-flag; the key (combining the tonic and major/minor into a single
-value); and a key strength plot which reports the degree to which the
-chroma vector extracted from each input block correlates to the stored
-key profiles for each major and minor key.  The key profiles are drawn
-from analysis of Book I of the Well Tempered Klavier by J S Bach,
-recorded at A=440 equal temperament.
-
-The outputs have the values:
-
-  Tonic pitch: C = 1, C#/Db = 2, ..., B = 12
-
-  Major/minor mode: major = 0, minor = 1
-
-  Key: C major = 1, C#/Db major = 2, ..., B major = 12
-       C minor = 13, C#/Db minor = 14, ..., B minor = 24
-
-  Key Strength Plot: 25 separate bins per feature, separated into 1-12
-      (major from C) and 14-25 (minor from C).  Bin 13 is unused, not
-      for superstitious reasons but simply so as to delimit the major
-      and minor areas if they are displayed on a single plot by the
-      plugin host.  Higher bin values show increased correlation with
-      the key profile for that key.
-
-The outputs are also labelled with pitch or key as text.
-
-
-Tonal Change
-------------
-
-  Identifier: qm-tonalchange
-  Authors: Chris Harte and Martin Gasser
-  Category: Key and Tonality
-
-  References: C. A. Harte, M. Gasser, and M. Sandler.
-      Detecting harmonic change in musical audio.
-      In Proceedings of the 1st ACM workshop on Audio and Music
-      Computing Multimedia, Santa Barbara, 2006.
-
-      C. A. Harte and M. Sandler.
-      Automatic chord identification using a quantised chromagram.
-      In Proceedings of the 118th Convention of the Audio
-      Engineering Society, Barcelona, Spain, May 28-31 2005.
-
-The Tonal Change plugin analyses a single channel of audio, detecting
-harmonic changes such as chord boundaries.
-
-It has three outputs: a representation of the musical content in a
-six-dimensional tonal space onto which the algorithm maps 12-bin
-chroma vectors extracted from the audio; a function representing the
-estimated likelihood of a tonal change occurring in each spectral
-frame; and the resulting estimated positions of tonal changes.
-
-
-Segmenter
----------
-
-  Identifier: qm-segmenter
-  Author: Mark Levy
-  Category: Classification
-
-  References: M. Levy and M. Sandler.
-      Structural segmentation of musical audio by constrained
-      clustering.
-      IEEE Transactions on Audio, Speech, and Language Processing,
-      February 2008.
-
-The Segmenter plugin divides a single channel of music up into
-structurally consistent segments.  Its single output contains a
-numeric value (the segment type) for each moment at which a new
-segment starts.
-
-For music with clearly tonally distinguishable sections such as verse,
-chorus, etc., the segments with the same type may be expected to be
-similar to one another in some structural sense (e.g. repetitions of
-the chorus).
-
-The type of feature used in segmentation can be selected using the
-Feature Type parameter.  The default Hybrid (Constant-Q) is generally
-effective for modern studio recordings, while the Chromatic option may
-be preferable for live, acoustic, or older recordings, in which
-repeated sections may be less consistent in sound.  Also available is
-a timbral (MFCC) feature, which is more likely to result in
-classification by instrumentation rather than musical content.
-
-Note that this plugin does a substantial amount of processing after
-receiving all of the input audio data, before it produces any results.
-
-
-Similarity
-----------
-
-  Identifier: qm-similarity
-  Authors: Mark Levy, Kurt Jacobson and Chris Cannam
-  Category: Classification
-
-  References: M. Levy and M. Sandler.
-      Lightweight measures for timbral similarity of musical audio.
-      In Proceedings of the 1st ACM workshop on Audio and Music
-      Computing Multimedia, Santa Barbara, 2006.
-
-      K. Jacobson.
-      A Multifaceted Approach to Music Similarity.
-      In Proceedings of the Seventh International Conference on
-      Music Information Retrieval (ISMIR), 2006.
-
-The Similarity plugin treats each channel of its audio input as a
-separate "track", and estimates how similar the tracks are to one
-another using a selectable similarity measure.
-
-The plugin also returns the intermediate data used as a basis of the
-similarity measure; it can therefore be used on a single channel of
-input (with the resulting intermediate data then being applied in some
-other similarity or clustering algorithm, for example) if desired, as
-well as with multiple inputs.
-
-The underlying audio features used for the similarity measure can be
-selected using the Feature Type parameter.  The available features are
-Timbre (in which the distance between tracks is a symmetrised
-Kullback-Leibler divergence between Gaussian-modelled MFCC means and
-variances across each track); Chroma (KL divergence of mean chroma
-histogram); Rhythm (cosine distance between "beat spectrum" measures
-derived from a short sampled section of the track); and the combined
-"Timbre and Rhythm" and "Chroma and Rhythm".
-
-The plugin has six outputs: a matrix of the distances between input
-channels; a vector containing the distances between the first input
-channel and each of the input channels; a pair of vectors containing
-the indices of the input channels in the order of their similarity to
-the first input channel, and the distances between the first input
-channel and each of those channels; the means of the underlying
-feature bins (MFCCs or chroma); the variances of the underlying
-feature bins; and the beat spectra used for the rhythmic feature.
-
-Because Vamp does not have the capability to return features in matrix
-form explicitly, the matrix output is returned as a series of vector
-features timestamped at one-second intervals.  Likewise, the
-underlying feature outputs contain one vector feature per input
-channel, timestamped at one-second intervals (so the feature for the
-first channel is at time 0, and so on).  Examining the features that
-the plugin actually returns, when run on some test data, may make this
-arrangement clearer.
-
-Note that the underlying feature values are only returned if the
-relevant feature type is selected.  That is, the means and variances
-outputs are valid provided the pure rhythm feature is not selected;
-the beat spectra output is valid provided rhythm is included in the
-selected feature type.
-
-
-Constant-Q Spectrogram
-----------------------
-
-  Identifier: qm-constantq
-  Author: Christian Landone
-  Category: Visualisation
-
-  References: J. Brown.
-      Calculation of a constant Q spectral transform.
-      Journal of the Acoustical Society of America, 89(1):
-      425-434, 1991.
-
-The Constant-Q Spectrogram plugin calculates a spectrogram based on a
-short-time windowed constant Q spectral transform.  This is a
-spectrogram in which the ratio of centre frequency to resolution is
-constant for each frequency bin.  The frequency bins correspond to the
-frequencies of "musical notes" rather than being linearly spaced in
-frequency as they are for the conventional DFT spectrogram.
-
-The pitch range and the number of frequency bins per octave may be
-adjusted using the plugin's parameters.  Note that the plugin's
-preferred step and block sizes depend on these parameters, and the
-plugin will not accept any other block size.
-
-
-Chromagram
-----------
-
-  Identifier: qm-chromagram
-  Author: Christian Landone
-  Category: Visualisation
-
-The Chromagram plugin calculates a constant Q spectral transform (as
-above) and then wraps the frequency bin values into a single octave,
-with each bin containing the sum of the magnitudes from the
-corresponding bin in all octaves.  The number of values in each
-feature vector returned by the plugin is therefore the same as the
-number of bins per octave configured for the underlying constant Q
-transform.
-
-The pitch range and the number of frequency bins per octave for the
-transform may be adjusted using the plugin's parameters.  Note that
-the plugin's preferred step and block sizes depend on these
-parameters, and the plugin will not accept any other block size.
-
-
-Mel-Frequency Cepstral Coefficients
------------------------------------
-
-  Identifier: qm-mfcc
-  Authors: Nicolas Chetry and Chris Cannam
-  Category: Low Level Features
-
-  References: B. Logan.
-      Mel-Frequency Cepstral Coefficients for Music Modeling.
-      In Proceedings of the First International Symposium on Music
-      Information Retrieval (ISMIR), 2000.
-
-The Mel-Frequency Cepstral Coefficients plugin calculates MFCCs from a
-single channel of audio, returning one MFCC vector from each process
-call.  It also returns the overall means of the coefficient values
-across the length of the audio input, as a separate output at the end
-of processing.
-
-
-Polyphonic Transcription
-------------------------
-
-  Identifier: qm-transcription
-  Author: Ruohua Zhou
-  Category: Notes
-
-  References: R. Zhou and J. D. Reiss.
-      A Real-Time Polyphonic Music Transcription System.
-      In Proceedings of the Fourth Music Information Retrieval
-      Evaluation eXchange (MIREX), Philadelphia, USA, 2008.
-
-      R. Zhou and J. D. Reiss.
-      A Real-Time Frame-Based Multiple Pitch Estimation
-      Method Using the Resonator Time Frequency Image.
-      Third Music Information Retrieval Evaluation eXchange
-      (MIREX), Vienna, Austria, 2007.
-
-The Polyphonic Transcription plugin estimates a note transcription
-using MIDI pitch values from its input audio, returning a feature for
-each note (with timestamp and duration) whose value is the MIDI pitch
-number.  Velocity is not estimated.
-
-This plugin requires a host with Vamp 2.0 support in order to return
-durations properly.
-
-Although the published method is described as real-time, the
-implementation used in this plugin is non-causal; it buffers its input
-to operate on as a single unit, doing all the real work after its
-entire input has been received, and is very memory intensive.
-However, it is relatively fast (faster than real-time) compared to
-other polyphonic transcription methods.
-
-The plugin works best at a 44.1kHz input sample rate, and is tuned
-for piano and guitar music.
-
-
-Adaptive Spectrogram
---------------------
-
-  Identifier: qm-adaptivespectrogram
-  Authors: Wen Xue and Chris Cannam
-  Category: Visualisation
-
-  References: X. Wen and M. Sandler.
-      Composite spectrogram using multiple Fourier transforms.
-      IET Signal Processing Journal, January 2009.
-
-The Adaptive Spectrogram plugin produces a composite spectrogram from
-a series of short-time Fourier transforms at differing resolutions.
-
-
-Discrete Wavelet Transform
---------------------------
-
-  Identifier: qm-dwt
-  Author: Thomas Wilmering
-  Category: Visualisation
-
-  References: S. Mallat.
-      A theory for multiresolution signal decomposition: the wavelet
-      representation.
-      In IEEE Transactions on Pattern Analysis and Machine
-      Intelligence, 11 (1989), pp. 674-693.
-
-      P. Rajmic and J. Vlach.
-      Real-Time Audio Processing via Segmented Wavelet Transform.
-      In Proceedings of the 10th Int. Conference on Digital Audio
-      Effects (DAFx-07), Bordeaux, France, September 10-15, 2007.
-
-The Discrete Wavelet Transform plugin performs the forward DWT on the
-signal.  The wavelet coefficients are derived from a fast segmented
-DWT algorithm without block end effects.  The DWT can be performed
-with various wavelet functions, up to the 16th scale.
-
-The wavelet coefficients are returned as feature columns at a rate of
-half the sample rate of the signal to be analysed.  To simulate
-multiresolution in the layer data table, the coefficient values at
-higher scales are copied multiple times according to the number of the
-scale.  For example, at scale 2 each value appears twice, at scale 3
-four times, and at scale 4 eight times, in order to simulate the lower
-resolution at higher scales.
-
-The Scales parameter adjusts the number of scales of the DWT.  The
-block size needs to be set to at least 2^n, where n is the number of
-scales.
-
-The Wavelet parameter selects the wavelet function to be used for the
-transform.  Wavelets from the following families are available:
-Daubechies, Symlets, Coiflets, Biorthogonal, Meyer.
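
As a side note, the numeric key encoding documented in the removed Key
Detector section (tonic C = 1 .. B = 12, mode major = 0 / minor = 1,
combined key value 1-24) is easy to decode in a host script.  The sketch
below is illustrative only; the helper names `decode_key` and `combine`
are hypothetical, not part of the plugin or the Vamp API.

```python
# Sketch of the Key Detector's documented output encoding:
# tonic pitch C = 1 .. B = 12, mode major = 0 / minor = 1,
# combined key value 1-12 (major from C) and 13-24 (minor from C).

PITCH_NAMES = ["C", "C#/Db", "D", "D#/Eb", "E", "F",
               "F#/Gb", "G", "G#/Ab", "A", "A#/Bb", "B"]

def decode_key(key: int) -> str:
    """Turn the plugin's combined key value (1-24) into a text label."""
    if not 1 <= key <= 24:
        raise ValueError("key value must be in 1..24")
    mode = 0 if key <= 12 else 1             # 1-12 major, 13-24 minor
    tonic = key if mode == 0 else key - 12   # tonic pitch, 1-12
    return f"{PITCH_NAMES[tonic - 1]} {'major' if mode == 0 else 'minor'}"

def combine(tonic: int, mode: int) -> int:
    """Inverse mapping: tonic (1-12) and mode (0/1) to key value (1-24)."""
    return tonic + 12 * mode

print(decode_key(1))    # C major
print(decode_key(14))   # C#/Db minor
```

The same arithmetic, shifted by one, explains the key strength plot
layout: bins 1-12 are the major keys from C, bin 13 is the unused
delimiter, and bins 14-25 are the minor keys from C.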