view README.txt.osx @ 80:e7c785094e7b

* update readmes &c
author Chris Cannam <c.cannam@qmul.ac.uk>
date Tue, 18 Nov 2008 14:51:49 +0000
parents b084e87b83e4
children
line wrap: on
line source

QM Vamp Plugins
===============

Vamp audio feature extraction plugins from Queen Mary, University of London.
Version 1.5.

For more information about Vamp plugins, see http://www.vamp-plugins.org/
and http://www.sonicvisualiser.org/ .


To Install
==========

This package contains plugins for the Apple OS/X operating system,
compatible with both PPC and Intel hardware.

To install them, copy the files

   qm-vamp-plugins.dylib and
   qm-vamp-plugins.cat

to either

   /Library/Audio/Plug-Ins/Vamp/ (for plugins available to all users) or
   $HOME/Library/Audio/Plug-Ins/Vamp/ (for plugins available to you only).


License
=======

These plugins are provided in binary form only.  You may install and
use the plugin binaries without fee for any purpose commercial or
non-commercial.  You may redistribute the plugin binaries provided you
do so without fee and you retain this README file with your
distribution.  You may not bundle these plugins with a commercial
product or distribute them on commercial terms.  If you wish to
arrange commercial licensing terms, please contact the Centre for
Digital Music at Queen Mary, University of London.

Copyright (c) 2006-2008 Queen Mary, University of London.  All rights
reserved except as described above.


About This Release
==================

This is a bugfix release only.  The plugins provided are unchanged
from 1.4.


Plugins Included
================

This plugin set includes the following plugins:

   * Note onset detector

   * Beat tracker and tempo estimator

   * Key estimator and tonal change detector

   * Segmenter, to divide a track into a consistent sequence of segments

   * Timbral and rhythmic similarity between audio tracks

   * Chromagram, constant-Q spectrogram, and MFCC calculation plugins

More details about the plugins follow.


Note Onset Detector
-------------------

 Identifier:    qm-onsetdetector
 Authors:       Chris Duxbury, Juan Pablo Bello and Christian Landone
 Category:      Time > Onsets

 References:    C. Duxbury, J. P. Bello, M. Davies and M. Sandler.
                Complex domain Onset Detection for Musical Signals.
                In Proceedings of the 6th Conference on Digital Audio
                Effects (DAFx-03). London, UK. September 2003.

                D. Stowell and M. D. Plumbley.
                Adaptive whitening for improved real-time audio onset
                detection.
                In Proceedings of the International Computer Music
                Conference (ICMC'07), August 2007.

                D. Barry, D. Fitzgerald, E. Coyle and B. Lawlor.
                Drum Source Separation using Percussive Feature
                Detection and Spectral Modulation.
                ISSC 2005

The Note Onset Detector plugin analyses a single channel of audio and
estimates the locations of note onsets within the music.

It calculates an onset likelihood function for each spectral frame,
and picks peaks in a smoothed version of this function.  The plugin is
non-causal, returning all results at the end of processing.

It has three outputs: the note onset positions, the onset detection
function used in estimating onset positions, and a smoothed version of
the detection function that is used in the peak-picking phase.


Tempo and Beat Tracker
----------------------

 Identifier:    qm-tempotracker
 Authors:       Matthew Davies and Christian Landone
 Category:      Time > Tempo

 References:    M. E. P. Davies and M. D. Plumbley.
                Context-dependent beat tracking of musical audio.
                In IEEE Transactions on Audio, Speech and Language
                Processing. Vol. 15, No. 3, pp1009-1020, 2007.

                M. E. P. Davies and M. D. Plumbley.
                Beat Tracking With A Two State Model.
                In Proceedings of the IEEE International Conference 
                on Acoustics, Speech and Signal Processing (ICASSP 2005),
                Vol. 3, pp241-244 Philadelphia, USA, March 19-23, 2005.

The Tempo and Beat Tracker plugin analyses a single channel of audio
and estimates the locations of metrical beats and the resulting tempo.

It has three outputs: the beat positions, an ongoing estimate of tempo
where available, and the onset detection function used in estimating
beat positions.


Key Detector
------------

 Identifier:    qm-keydetector
 Authors:       Katy Noland and Christian Landone
 Category:      Key and Tonality

 References:    K. Noland and M. Sandler.
                Signal Processing Parameters for Tonality Estimation.
                In Proceedings of Audio Engineering Society 122nd
                Convention, Vienna, 2007.

The Key Detector plugin analyses a single channel of audio and
continuously estimates the key of the music.

It has four outputs: the tonic pitch of the key; a major or minor mode
flag; the key (combining the tonic and major/minor into a single
value); and a key strength plot which reports the degree to which the
chroma vector extracted from each input block correlates to the stored
key profiles for each major and minor key.  The key profiles are drawn
from analysis of Book I of the Well Tempered Klavier by J S Bach,
recorded at A=440 equal temperament.

The outputs have the values:

  Tonic pitch: C = 1, C#/Db = 2, ..., B = 12

  Major/minor mode: major = 0, minor = 1

  Key: C major = 1, C#/Db major = 2, ..., B major = 12
       C minor = 13, C#/Db minor = 14, ..., B minor = 24

  Key Strength Plot: 25 separate bins per feature, separated into 1-12
       (major from C) and 14-25 (minor from C).  Bin 13 is unused, not
       for superstitious reasons but simply so as to delimit the major
       and minor areas if they are displayed on a single plot by the
       plugin host.  Higher bin values show increased correlation with
       the key profile for that key.

The outputs are also labelled with pitch or key as text.


Tonal Change
------------

 Identifier:    qm-tonalchange
 Authors:       Chris Harte and Martin Gasser
 Category:      Key and Tonality

 References:    C. A. Harte, M. Gasser, and M. Sandler.
                Detecting harmonic change in musical audio.
                In Proceedings of the 1st ACM workshop on Audio and Music
                Computing Multimedia, Santa Barbara, 2006.

                C. A. Harte and M. Sandler.
                Automatic chord identification using a quantised chromagram.
                In Proceedings of the 118th Convention of the Audio
                Engineering Society, Barcelona, Spain, May 28-31 2005.

The Tonal Change plugin analyses a single channel of audio, detecting
harmonic changes such as chord boundaries.

It has three outputs: a representation of the musical content in a
six-dimensional tonal space onto which the algorithm maps 12-bin
chroma vectors extracted from the audio; a function representing the
estimated likelihood of a tonal change occurring in each spectral
frame; and the resulting estimated positions of tonal changes.


Segmenter
---------

 Identifier:    qm-segmenter
 Authors:       Mark Levy
 Category:      Classification

 References:    M. Levy and M. Sandler.
                Structural segmentation of musical audio by constrained
                clustering.
                IEEE Transactions on Audio, Speech, and Language Processing,
                February 2008.

The Segmenter plugin divides a single channel of music up into
structurally consistent segments.  Its single output contains a
numeric value (the segment type) for each moment at which a new
segment starts.

For music with clearly tonally distinguishable sections such as verse,
chorus, etc., the segments with the same type may be expected to be
similar to one another in some structural sense (e.g. repetitions of
the chorus).

The type of feature used in segmentation can be selected using the
Feature Type parameter.  The default Hybrid (Constant-Q) is generally
effective for modern studio recordings, while the Chromatic option may
be preferable for live, acoustic, or older recordings, in which
repeated sections may be less consistent in sound.  Also available is
a timbral (MFCC) feature, which is more likely to result in
classification by instrumentation rather than musical content.

Note that this plugin does a substantial amount of processing after
receiving all of the input audio data, before it produces any results.


Similarity
----------

 Identifier:    qm-similarity
 Authors:       Mark Levy, Kurt Jacobson and Chris Cannam
 Category:      Classification

 References:    M. Levy and M. Sandler.
                Lightweight measures for timbral similarity of musical audio.
                In Proceedings of the 1st ACM workshop on Audio and Music
                Computing Multimedia, Santa Barbara, 2006.

                K. Jacobson.
                A Multifaceted Approach to Music Similarity.
                In Proceedings of the Seventh International Conference on
                Music Information Retrieval (ISMIR), 2006.

The Similarity plugin treats each channel of its audio input as a
separate "track", and estimates how similar the tracks are to one
another using a selectable similarity measure.

The plugin also returns the intermediate data used as a basis of the
similarity measure; it can therefore be used on a single channel of
input (with the resulting intermediate data then being applied in some
other similarity or clustering algorithm, for example) if desired, as
well as with multiple inputs.

The underlying audio features used for the similarity measure can be
selected using the Feature Type parameter.  The available features are
Timbre (in which the distance between tracks is a symmetrised
Kullback-Leibler divergence between Gaussian-modelled MFCC means and
variances across each track); Chroma (KL divergence of mean chroma
histogram); Rhythm (cosine distance between "beat spectrum" measures
derived from a short sampled section of the track); and combined
"Timbre and Rhythm" and "Chroma and Rhythm".

The plugin has six outputs: a matrix of the distances between input
channels; a vector containing the distances between the first input
channel and each of the input channels; a pair of vectors containing
the indices of the input channels in the order of their similarity to
the first input channel, and the distances between the first input
channel and each of those channels; the means of the underlying
feature bins (MFCCs or chroma); the variances of the underlying
feature bins; and the beat spectra used for the rhythmic feature.

Because Vamp does not have the capability to return features in matrix
form explicitly, the matrix output is returned as a series of vector
features timestamped at one-second intervals.  Likewise, the
underlying feature outputs contain one vector feature per input
channel, timestamped at one-second intervals (so the feature for the
first channel is at time 0, and so on).  Examining the features that
the plugin actually returns, when run on some test data, may make this
arrangement more clear.

Note that the underlying feature values are only returned if the
relevant feature type is selected.  That is, the means and variances
outputs are valid provided the pure rhythm feature is not selected;
the beat spectra output is valid provided rhythm is included in the
selected feature type.


Constant-Q Spectrogram
----------------------

 Identifier:    qm-constantq
 Authors:       Christian Landone
 Category:      Visualisation

 References:    J. Brown.
                Calculation of a constant Q spectral transform.
                Journal of the Acoustical Society of America, 89(1):
                425-434, 1991.

The Constant-Q Spectrogram plugin calculates a spectrogram based on a
short-time windowed constant Q spectral transform.  This is a
spectrogram in which the ratio of centre frequency to resolution is
constant for each frequency bin.  The frequency bins correspond to the
frequencies of "musical notes" rather than being linearly spaced in
frequency as they are for the conventional DFT spectrogram.

The pitch range and the number of frequency bins per octave may be
adjusted using the plugin's parameters.  Note that the plugin's
preferred step and block sizes depend on these parameters, and the
plugin will not accept any other block size.


Chromagram
----------

 Identifier:    qm-chromagram
 Authors:       Christian Landone
 Category:      Visualisation

The Chromagram plugin calculates a constant Q spectral transform (as
above) and then wraps the frequency bin values into a single octave,
with each bin containing the sum of the magnitudes from the
corresponding bin in all octaves.  The number of values in each
feature vector returned by the plugin is therefore the same as the
number of bins per octave configured for the underlying constant Q
transform.

The pitch range and the number of frequency bins per octave for the
transform may be adjusted using the plugin's parameters.  Note that
the plugin's preferred step and block sizes depend on these
parameters, and the plugin will not accept any other block size.


Mel-Frequency Cepstral Coefficients
-----------------------------------

 Identifier:    qm-mfcc
 Authors:       Nicolas Chetry and Chris Cannam
 Category:      Low Level Features

 References:    B. Logan.
                Mel-Frequency Cepstral Coefficients for Music Modeling.
                In Proceedings of the First International Symposium on Music
                Information Retrieval (ISMIR), 2000.

The Mel-Frequency Cepstral Coefficients plugin calculates MFCCs from a
single channel of audio, returning one MFCC vector from each process
call.  It also returns the overall means of the coefficient values
across the length of the audio input, as a separate output at the end
of processing.