Mercurial > hg > qm-vamp-plugins
changeset 48:3b4572153ce3
* Similarity -> single user control rather than separate weighting
* Key detector -> correct reported min/max values for outputs
* Start some documentation
author | Chris Cannam <c.cannam@qmul.ac.uk> |
---|---|
date | Mon, 21 Jan 2008 18:05:28 +0000 |
parents | f8c5f11e60a6 |
children | fc88b465548a |
files | README.txt plugins/BeatTrack.cpp plugins/KeyDetect.cpp plugins/SimilarityPlugin.cpp qm-vamp-plugins.cat qm-vamp-plugins.pro |
diffstat | 6 files changed, 320 insertions(+), 45 deletions(-) |
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/README.txt	Mon Jan 21 18:05:28 2008 +0000
@@ -0,0 +1,225 @@
+
+QM Vamp Plugins
+===============
+
+Vamp audio feature extraction plugins from Queen Mary, University of London.
+Version 1.4.
+
+For more information about Vamp plugins, see http://www.vamp-plugins.org/
+and http://www.sonicvisualiser.org/.
+
+
+New In This Release
+===================
+
+This release contains a new plugin to estimate timbral and rhythmic
+similarity between multiple audio tracks, a plugin for structural
+segmentation of music audio, and a Mel-frequency cepstral coefficients
+calculation plugin.
+
+This release also includes fixes to the existing key detector and
+chromagram plugins.
+
+
+To Install
+==========
+
+Installation depends on your operating system.
+
+   Windows -> Copy qm-vamp-plugins.dll and qm-vamp-plugins.cat to
+              C:\Program Files\Vamp Plugins\
+
+   OS/X    -> Copy qm-vamp-plugins.dylib and qm-vamp-plugins.cat to
+              $HOME/Library/Audio/Plug-Ins/Vamp/
+              or
+              /Library/Audio/Plug-Ins/Vamp/
+
+   Linux   -> Copy qm-vamp-plugins.so and qm-vamp-plugins.cat to
+              $HOME/vamp/
+              or
+              /usr/local/lib/vamp/
+              or
+              /usr/lib/vamp/
+
+
+Plugins Included
+================
+
+This plugin set includes the following plugins:
+
+ * Note onset detector
+ * Beat tracker and tempo estimator
+ * Key estimator and tonal change detector
+ * Segmenter, to divide a track into a consistent sequence of segments
+ * Timbral and rhythmic similarity between audio tracks
+ * Chromagram, constant-Q spectrogram, and MFCC calculation plugins
+
+More details about the plugins follow.
+
+
+Note Onset Detector
+-------------------
+
+Identifier: qm-onsetdetector
+Authors:    Chris Duxbury, Juan Pablo Bello and Christian Landone
+Category:   Time > Onsets
+
+References: C. Duxbury, J. P. Bello, M. Davies and M. Sandler.
+            Complex domain Onset Detection for Musical Signals.
+            In Proceedings of the 6th Conference on Digital Audio
+            Effects (DAFx-03). London, UK. September 2003.
+
+            D. Stowell and M. D. Plumbley.
+            Adaptive whitening for improved real-time audio onset
+            detection.
+            In Proceedings of the International Computer Music
+            Conference (ICMC'07), August 2007.
+
+            D. Barry, D. Fitzgerald, E. Coyle and B. Lawlor.
+            Drum Source Separation using Percussive Feature
+            Detection and Spectral Modulation.
+            ISSC 2005
+
+This plugin analyses a single channel of audio and estimates the
+locations of note onsets within the music.
+
+It has three outputs: the note onset positions, the onset detection
+function used in estimating onset positions, and a smoothed version of
+the detection function that is used in the peak-picking phase.
+
+
+Tempo and Beat Tracker
+----------------------
+
+Identifier: qm-tempotracker
+Authors:    Matthew Davies and Christian Landone
+Category:   Time > Tempo
+
+References: M. E. P. Davies and M. D. Plumbley.
+            Context-dependent beat tracking of musical audio.
+            Technical Report C4DM-TR-06-02. 5 April 2006.
+
+            M. E. P. Davies and M. D. Plumbley.
+            Beat Tracking With A Two State Model.
+            In Proceedings of the IEEE International Conference
+            on Acoustics, Speech and Signal Processing (ICASSP 2005),
+            Vol. 3, pp241-244 Philadelphia, USA, March 19-23, 2005.
+
+This plugin analyses a single channel of audio and estimates the
+locations of metrical beats and the resulting tempo of the music.
+
+It has three outputs: the beat positions, an ongoing estimate of tempo
+where available, and the onset detection function used in estimating
+beat positions.
+
+
+
+Key Detector
+------------
+
+Identifier: qm-keydetector
+Authors:    Katy Noland and Christian Landone
+Category:   Key and Tonality
+
+References: K. Noland and M. Sandler.
+            Signal Processing Parameters for Tonality Estimation.
+            In Proceedings of Audio Engineering Society 122nd Convention,
+            Vienna, 2007.
+
+This plugin analyses a single channel of audio and continuously
+estimates the key of the music.
+
+It has three outputs: the tonic pitch of the key; a major or minor
+mode flag; and key (combining the tonic and major/minor into a single
+value).  These outputs have the values:
+
+  Tonic pitch:         C = 1, C#/Db = 2, ..., B = 12
+  Major/minor mode:    major = 0, minor = 1
+  Key:                 C major = 1, C#/Db major = 2, ..., B major = 12
+                       C minor = 13, C#/Db minor = 14, ..., B minor = 24
+
+The outputs are also labelled with pitch or key as text.
+
+
+Tonal Change
+------------
+
+Identifier: qm-tonalchange
+Authors:    Chris Harte and Martin Gasser
+Category:   Key and Tonality
+
+References: C. A. Harte and M. Sandler.
+            Automatic chord identification using a quantised chromagram.
+            In Proceedings of the 118th Convention of the Audio
+            Engineering Society, Barcelona, Spain, May 28-31 2005.
+
+
+Segmenter
+---------
+
+Identifier: qm-segmenter
+Authors:    Mark Levy
+Category:   Classification
+
+References: M. Levy and M. Sandler.
+            Structural segmentation of musical audio by constrained
+            clustering.
+            IEEE Transactions on Audio, Speech, and Language Processing,
+            February 2008.
+
+
+
+Similarity
+----------
+
+Identifier: qm-similarity
+Authors:    Mark Levy, Kurt Jacobson and Chris Cannam
+Category:   Classification
+
+References: M. Levy and M. Sandler.
+            Lightweight measures for timbral similarity of musical audio.
+            In Proceedings of the 1st ACM workshop on Audio and Music
+            Computing Multimedia, Santa Barbara, 2006.
+
+            K. Jacobson.
+            A Multifaceted Approach to Music Similarity.
+            In Proceedings of the Seventh International Conference on Music
+            Information Retrieval (ISMIR), 2006.
+
+
+Constant-Q Spectrogram
+----------------------
+
+Identifier: qm-constantq
+Authors:    Christian Landone
+Category:   Visualisation
+
+References: J. Brown.
+            Calculation of a constant Q spectral transform.
+            Journal of the Acoustical Society of America, 89(1):
+            425-434, 1991.
+
+
+Chromagram
+----------
+
+Identifier: qm-chromagram
+Authors:    Christian Landone
+Category:   Visualisation
+
+References:
+
+
+Mel-Frequency Cepstral Coefficients
+-----------------------------------
+
+Identifier: qm-mfcc
+Authors:    Nicolas Chetry and Chris Cannam
+Category:   Low Level Features
+
+References: B. Logan.
+            Mel-Frequency Cepstral Coefficients for Music Modeling.
+            In Proceedings of the First International Symposium on Music
+            Information Retrieval (ISMIR), 2000.
+
+
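The install locations listed under "To Install" in the README can be collected into a small lookup. This is only a sketch: `Platform` and `installDirs` are hypothetical names introduced here, and the function simply returns the directories the README names, in the order it lists them. It says nothing about how a Vamp host actually searches for plugins.

```cpp
#include <string>
#include <vector>

// Hypothetical enum for illustration; not part of qm-vamp-plugins.
enum class Platform { Windows, MacOS, Linux };

// Returns the directories README.txt suggests copying the plugin
// library and qm-vamp-plugins.cat into, per-user location first
// where the README gives one. "home" stands in for $HOME.
std::vector<std::string> installDirs(Platform p, const std::string &home)
{
    switch (p) {
    case Platform::Windows:
        return { "C:\\Program Files\\Vamp Plugins\\" };
    case Platform::MacOS:
        return { home + "/Library/Audio/Plug-Ins/Vamp/",
                 "/Library/Audio/Plug-Ins/Vamp/" };
    case Platform::Linux:
        return { home + "/vamp/",
                 "/usr/local/lib/vamp/",
                 "/usr/lib/vamp/" };
    }
    return {};
}
```

For example, `installDirs(Platform::Linux, "/home/user")` yields the three Linux locations with `/home/user/vamp/` first.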
--- a/plugins/BeatTrack.cpp	Fri Jan 18 18:11:01 2008 +0000
+++ b/plugins/BeatTrack.cpp	Mon Jan 21 18:05:28 2008 +0000
@@ -87,7 +87,7 @@
 string
 BeatTracker::getCopyright() const
 {
-    return "Copyright (c) 2006-2007 - All Rights Reserved";
+    return "Copyright (c) 2006-2008 - All Rights Reserved";
 }
 
 BeatTracker::ParameterList
--- a/plugins/KeyDetect.cpp	Fri Jan 18 18:11:01 2008 +0000
+++ b/plugins/KeyDetect.cpp	Mon Jan 21 18:05:28 2008 +0000
@@ -196,8 +196,8 @@
     d.binCount = 1;
     d.hasKnownExtents = true;
     d.isQuantized = true;
-    d.minValue = 0;
-    d.maxValue = 11;
+    d.minValue = 1;
+    d.maxValue = 12;
     d.quantizeStep = 1;
     d.sampleType = OutputDescriptor::OneSamplePerStep;
     list.push_back(d);
@@ -223,8 +223,8 @@
     d.binCount = 1;
     d.hasKnownExtents = true;
     d.isQuantized = true;
-    d.minValue = 0;
-    d.maxValue = 23;
+    d.minValue = 1;
+    d.maxValue = 24;
     d.quantizeStep = 1;
     d.binNames.erase(d.binNames.begin(),d.binNames.end());
     d.sampleType = OutputDescriptor::OneSamplePerStep;
@@ -328,8 +328,8 @@
 const char *
 KeyDetector::getKeyName(int index)
 {
-    // Keys are numbered with 1 => C, 12 => B
-    // This is based on chromagram base set to a C in qm-dsp's GetKeyMode.cpp
+    // Keys are numbered with 1 => C, 12 => B
+    // This is based on chromagram base set to a C in qm-dsp's GetKeyMode.cpp
     static const char *names[] = {
         "C", "C# / Db", "D", "D# / Eb",
         "E", "F", "F# / Gb", "G",
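The corrected extents in the KeyDetect.cpp hunks agree with the numbering given in the README: tonic 1 to 12, and combined key 1 to 24 with the minor keys offset by 12. A minimal sketch of decoding the combined key value under that numbering follows. `DecodedKey` and `decodeKey` are hypothetical helpers, not plugin code, and the last four pitch names are an assumption, since the diff context stops after "G".

```cpp
#include <cassert>
#include <string>

// Hypothetical helper types for illustration; not part of qm-vamp-plugins.
struct DecodedKey {
    int tonic;          // 1 => C, ..., 12 => B
    int mode;           // 0 => major, 1 => minor
    std::string label;  // e.g. "C# / Db minor"
};

// Splits the combined "key" output (1..24) into tonic and mode,
// following the numbering described in README.txt.
inline DecodedKey decodeKey(int key)
{
    // The first eight names appear in the diff; the remaining four
    // are assumed to continue chromatically.
    static const char *names[] = {
        "C", "C# / Db", "D", "D# / Eb",
        "E", "F", "F# / Gb", "G",
        "G# / Ab", "A", "A# / Bb", "B"
    };
    assert(key >= 1 && key <= 24);

    DecodedKey k;
    k.mode = (key > 12) ? 1 : 0;
    k.tonic = k.mode ? key - 12 : key;
    k.label = std::string(names[k.tonic - 1]) + (k.mode ? " minor" : " major");
    return k;
}
```

With this numbering, `decodeKey(1)` describes C major and `decodeKey(24)` describes B minor, which are the two ends of the range now reported by `minValue` and `maxValue`.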
--- a/plugins/SimilarityPlugin.cpp	Fri Jan 18 18:11:01 2008 +0000
+++ b/plugins/SimilarityPlugin.cpp	Mon Jan 21 18:05:28 2008 +0000
@@ -39,7 +39,7 @@
     m_chromagram(0),
     m_decimator(0),
     m_featureColumnSize(20),
-    m_rhythmWeighting(0.f),
+    m_rhythmWeighting(0.5f),
     m_rhythmClipDuration(4.f), // seconds
     m_rhythmClipOrigin(40.f), // seconds
     m_rhythmClipFrameSize(0),
@@ -290,17 +290,20 @@
     ParameterDescriptor desc;
     desc.identifier = "featureType";
     desc.name = "Feature Type";
-    desc.description = "Audio feature used for similarity measure. Timbral: use the first 20 MFCCs (19 plus C0). Chromatic: use 12 bin-per-octave chroma.";
+    desc.description = "Audio feature used for similarity measure. Timbral: use the first 20 MFCCs (19 plus C0). Chromatic: use 12 bin-per-octave chroma. Rhythmic: compare beat spectra of short regions.";
     desc.unit = "";
     desc.minValue = 0;
-    desc.maxValue = 1;
-    desc.defaultValue = 0;
+    desc.maxValue = 4;
+    desc.defaultValue = 1;
     desc.isQuantized = true;
     desc.quantizeStep = 1;
-    desc.valueNames.push_back("Timbral (MFCC)");
-    desc.valueNames.push_back("Chromatic (Chroma)");
+    desc.valueNames.push_back("Timbre");
+    desc.valueNames.push_back("Timbre and Rhythm");
+    desc.valueNames.push_back("Chroma");
+    desc.valueNames.push_back("Chroma and Rhythm");
+    desc.valueNames.push_back("Rhythm only");
     list.push_back(desc);
-
+/*
     desc.identifier = "rhythmWeighting";
     desc.name = "Influence of Rhythm";
     desc.description = "Proportion of similarity measure made up from rhythmic similarity component, from 0 (entirely timbral or chromatic) to 100 (entirely rhythmic).";
@@ -308,11 +311,10 @@
     desc.minValue = 0;
     desc.maxValue = 100;
     desc.defaultValue = 0;
-    desc.isQuantized = true;
-    desc.quantizeStep = 1;
+    desc.isQuantized = false;
     desc.valueNames.clear();
     list.push_back(desc);
-
+*/
     return list;
 }
 
@@ -320,11 +322,28 @@
 SimilarityPlugin::getParameter(std::string param) const
 {
     if (param == "featureType") {
-        if (m_type == TypeMFCC) return 0;
-        else if (m_type == TypeChroma) return 1;
-        else return 0;
-    } else if (param == "rhythmWeighting") {
-        return nearbyint(m_rhythmWeighting * 100.0);
+
+        if (m_rhythmWeighting > m_allRhythm) {
+            return 4;
+        }
+
+        switch (m_type) {
+
+        case TypeMFCC:
+            if (m_rhythmWeighting < m_noRhythm) return 0;
+            else return 1;
+            break;
+
+        case TypeChroma:
+            if (m_rhythmWeighting < m_noRhythm) return 2;
+            else return 3;
+            break;
+        }
+
+        return 1;
+
+//    } else if (param == "rhythmWeighting") {
+//        return nearbyint(m_rhythmWeighting * 100.0);
     }
 
     std::cerr << "WARNING: SimilarityPlugin::getParameter: unknown parameter \""
@@ -336,15 +355,27 @@
 SimilarityPlugin::setParameter(std::string param, float value)
 {
     if (param == "featureType") {
+
         int v = int(value + 0.1);
-        Type prevType = m_type;
-        if (v == 0) m_type = TypeMFCC;
-        else if (v == 1) m_type = TypeChroma;
-        if (m_type != prevType) m_blockSize = 0;
+
+        Type newType = m_type;
+
+        switch (v) {
+        case 0: newType = TypeMFCC; m_rhythmWeighting = 0.0f; break;
+        case 1: newType = TypeMFCC; m_rhythmWeighting = 0.5f; break;
+        case 2: newType = TypeChroma; m_rhythmWeighting = 0.0f; break;
+        case 3: newType = TypeChroma; m_rhythmWeighting = 0.5f; break;
+        case 4: newType = TypeMFCC; m_rhythmWeighting = 1.f; break;
+        }
+
+        if (newType != m_type) m_blockSize = 0;
+
+        m_type = newType;
         return;
-    } else if (param == "rhythmWeighting") {
-        m_rhythmWeighting = value / 100;
-        return;
+
+//    } else if (param == "rhythmWeighting") {
+//        m_rhythmWeighting = value / 100;
+//        return;
     }
 
     std::cerr << "WARNING: SimilarityPlugin::setParameter: unknown parameter \""
@@ -629,22 +660,40 @@
         v[i] = variance;
     }
 
-    // "Despite the fact that MFCCs extracted from music are clearly
-    // not Gaussian, [14] showed, somewhat surprisingly, that a
-    // similarity function comparing single Gaussians modelling MFCCs
-    // for each track can perform as well as mixture models.  A great
-    // advantage of using single Gaussians is that a simple closed
-    // form exists for the KL divergence." -- Mark Levy, "Lightweight
-    // measures for timbral similarity of musical audio"
-    // (http://www.elec.qmul.ac.uk/easaier/papers/mlevytimbralsimilarity.pdf)
-
-    KLDivergence kld;
     FeatureMatrix distances(m_channels);
 
-    for (int i = 0; i < m_channels; ++i) {
-        for (int j = 0; j < m_channels; ++j) {
-            double d = kld.distance(m[i], v[i], m[j], v[j]);
-            distances[i].push_back(d);
+    if (m_type == TypeMFCC) {
+
+        // "Despite the fact that MFCCs extracted from music are
+        // clearly not Gaussian, [14] showed, somewhat surprisingly,
+        // that a similarity function comparing single Gaussians
+        // modelling MFCCs for each track can perform as well as
+        // mixture models.  A great advantage of using single
+        // Gaussians is that a simple closed form exists for the KL
+        // divergence." -- Mark Levy, "Lightweight measures for
+        // timbral similarity of musical audio"
+        // (http://www.elec.qmul.ac.uk/easaier/papers/mlevytimbralsimilarity.pdf)
+
+        KLDivergence kld;
+
+        for (int i = 0; i < m_channels; ++i) {
+            for (int j = 0; j < m_channels; ++j) {
+                double d = kld.distanceGaussian(m[i], v[i], m[j], v[j]);
+                distances[i].push_back(d);
+            }
+        }
+
+    } else {
+
+        // Chroma are histograms already
+
+        KLDivergence kld;
+
+        for (int i = 0; i < m_channels; ++i) {
+            for (int j = 0; j < m_channels; ++j) {
+                double d = kld.distanceDistribution(m[i], m[j], true);
+                distances[i].push_back(d);
+            }
         }
     }
 
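The setParameter and getParameter changes in SimilarityPlugin.cpp fold the feature type and the rhythm weighting into one five-valued control. The sketch below restates that mapping as a pair of free functions so the round trip is easy to see. `Setting`, `settingFromIndex`, `indexFromSetting`, `kNoRhythm` and `kAllRhythm` are hypothetical names, and the two threshold values are assumptions: the diff uses `m_noRhythm` and `m_allRhythm` without showing how they are defined.

```cpp
#include <cassert>

enum FeatureType { TypeMFCC, TypeChroma };

// Hypothetical stand-in for the plugin's m_type / m_rhythmWeighting pair.
struct Setting {
    FeatureType type;
    float rhythmWeighting;  // 0 = no rhythm component, 1 = rhythm only
};

// Assumed thresholds; the real values of m_noRhythm and m_allRhythm
// are not visible in this changeset.
const float kNoRhythm = 0.01f;
const float kAllRhythm = 0.99f;

// Mirrors the switch in setParameter: an out-of-range index leaves
// the current setting unchanged.
Setting settingFromIndex(int v, Setting current)
{
    switch (v) {
    case 0: return { TypeMFCC, 0.0f };    // Timbre
    case 1: return { TypeMFCC, 0.5f };    // Timbre and Rhythm
    case 2: return { TypeChroma, 0.0f };  // Chroma
    case 3: return { TypeChroma, 0.5f };  // Chroma and Rhythm
    case 4: return { TypeMFCC, 1.0f };    // Rhythm only
    default: return current;
    }
}

// Mirrors getParameter: "Rhythm only" is detected first, then the
// feature type and weighting pick one of the remaining four values.
int indexFromSetting(const Setting &s)
{
    assert(s.rhythmWeighting >= 0.0f && s.rhythmWeighting <= 1.0f);
    if (s.rhythmWeighting > kAllRhythm) return 4;
    switch (s.type) {
    case TypeMFCC:   return s.rhythmWeighting < kNoRhythm ? 0 : 1;
    case TypeChroma: return s.rhythmWeighting < kNoRhythm ? 2 : 3;
    }
    return 1;
}
```

One consequence visible in the diff and reproduced here: choosing "Rhythm only" always selects `TypeMFCC`, so switching to it from a chroma setting counts as a type change.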
--- a/qm-vamp-plugins.cat	Fri Jan 18 18:11:01 2008 +0000
+++ b/qm-vamp-plugins.cat	Mon Jan 21 18:05:28 2008 +0000
@@ -4,5 +4,6 @@
 vamp:qm-vamp-plugins:qm-constantq::Visualisation
 vamp:qm-vamp-plugins:qm-tonalchange::Key and Tonality
 vamp:qm-vamp-plugins:qm-keydetector::Key and Tonality
-vamp:qm-vamp-plugins:qm-segmenter::Key and Tonality
+vamp:qm-vamp-plugins:qm-segmenter::Classification
 vamp:qm-vamp-plugins:qm-similarity::Classification
+vamp-qm-vamp-plugins:qm-mfcc::Low Level Features
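Judging from the lines in this hunk, each entry in qm-vamp-plugins.cat has the shape `vamp:<library>:<plugin>::<category>`. A minimal parser sketch under that assumption follows; `CatEntry` and `parseCatLine` are hypothetical names and do not come from the Vamp SDK. Note that the added qm-mfcc line begins with `vamp-` rather than `vamp:`, so this sketch rejects it. Whether a real host tolerates that line is not something the diff shows.

```cpp
#include <string>

// Hypothetical record for one category line; not a Vamp SDK type.
struct CatEntry {
    std::string library;
    std::string plugin;
    std::string category;
};

// Parses "vamp:<library>:<plugin>::<category>". Returns false if the
// line does not have that shape; "out" is only meaningful on success.
bool parseCatLine(const std::string &line, CatEntry &out)
{
    const std::string prefix = "vamp:";
    if (line.compare(0, prefix.size(), prefix) != 0) return false;

    // The double colon separates the plugin key from the category.
    std::string::size_type sep = line.find("::", prefix.size());
    if (sep == std::string::npos) return false;

    // The key is "<library>:<plugin>".
    std::string key = line.substr(prefix.size(), sep - prefix.size());
    std::string::size_type colon = key.find(':');
    if (colon == std::string::npos) return false;

    out.library = key.substr(0, colon);
    out.plugin = key.substr(colon + 1);
    out.category = line.substr(sep + 2);
    return !out.library.empty() && !out.plugin.empty();
}
```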
--- a/qm-vamp-plugins.pro	Fri Jan 18 18:11:01 2008 +0000
+++ b/qm-vamp-plugins.pro	Mon Jan 21 18:05:28 2008 +0000
@@ -1,7 +1,7 @@
 
 TEMPLATE = lib
 
-CONFIG += plugin warn_on release
+CONFIG += plugin warn_on debug
 CONFIG -= qt
 
 linux-g++:QMAKE_CXXFLAGS_RELEASE += -DNDEBUG -O3 -march=pentium4 -msse -msse2