
--- a/vamp-plugins_abstract/qmvamp-mirex2013.tex	Fri Sep 06 18:58:09 2013 +0100
+++ b/vamp-plugins_abstract/qmvamp-mirex2013.tex	Fri Sep 06 19:03:30 2013 +0100
@@ -64,6 +64,65 @@

 The Key Detector Vamp plugin was written by Katy Noland and Christian Landone.

+\section{Audio Chord Extraction}
+\label{chordino}
+The Chordino plugin was developed following Mauch's 2010 work on chord
+extraction, as submitted to MIREX in that
+year~\cite{mauch:md1:2010}. While that submission used a C++ chroma
+implementation with a MATLAB dynamic Bayesian network as a chord
+extraction front-end~\cite{matthias2010a}, Chordino is an entirely C++
+implementation that was developed specifically to be made freely
+available as an open-source plugin for general use.
+
+The Chordino method has two parts, described in the following subsections.
+
+\subsection{NNLS Chroma}
+
+NNLS Chroma analyses a single channel of audio using frame-wise
+spectral input from the Vamp host. The spectrum is transformed to a
+log-frequency spectrum (constant-Q) with three bins per semitone. On
+this representation, two processing steps are performed: tuning, after
+which each centre bin (i.e.\ bin 2, 5, 8, \ldots) corresponds to a semitone,
+even if the tuning of the piece deviates from 440 Hz standard pitch;
+and running standardisation, that is, subtraction of the running mean
+followed by division by the running standard deviation. This has a
+spectral whitening effect.
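+
+As an illustrative sketch of the running standardisation (the symbols
+are introduced here for illustration only, and we assume the running
+statistics are taken over a window of neighbouring log-frequency
+bins), let $Y_k$ be the tuned log-frequency magnitude at bin $k$, with
+running mean $\mu_k$ and running standard deviation $\sigma_k$. The
+standardised value is then
+\[
+  \tilde{Y}_k = \frac{Y_k - \mu_k}{\sigma_k} .
+\]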
+
+The processed log-frequency spectrum is then used as an input for NNLS
+approximate transcription using a dictionary of harmonic notes with
+geometrically decaying harmonic magnitudes. The output of the NNLS
+approximate transcription is semitone-spaced. To get the chroma, this
+semitone spectrum is multiplied (element-wise) with the desired
+profile (chroma or bass chroma) and then mapped to 12 bins.
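+
+In symbols (introduced here for illustration only, not taken from the
+implementation), let $\mathbf{y}$ be one frame of the processed
+log-frequency spectrum and let the columns of $\mathbf{E}$ be the note
+profiles of the dictionary, where the $h$-th harmonic of each note is
+assumed to have magnitude $s^{h-1}$ for some decay factor $0 < s < 1$.
+The approximate transcription can then be written as the
+non-negative least-squares problem
+\[
+  \hat{\mathbf{x}} = \arg\min_{\mathbf{x} \ge 0}
+    \| \mathbf{y} - \mathbf{E}\mathbf{x} \|_2^2 ,
+\]
+and, with $w_n$ the profile weight for semitone $n$ and semitones
+indexed so that $n \bmod 12$ gives the pitch class, the chroma bin for
+pitch class $p \in \{0, \ldots, 11\}$ as
+\[
+  c_p = \sum_{n \,:\, n \bmod 12 = p} w_n \, \hat{x}_n .
+\]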
+
+\subsection{Chord transcription}
+
+A fixed dictionary of chord profiles is used to calculate frame-wise
+chord similarities. A standard HMM/Viterbi approach is used to smooth
+these to provide a chord transcription.
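+
+The smoothing step can be sketched with the usual Viterbi recursion.
+The notation is introduced here for illustration: let $\pi_j$ be the
+initial probability of chord $j$, $a_{ij}$ the transition probability
+from chord $i$ to chord $j$, and $b_j(t)$ the observation score of
+chord $j$ at frame $t$, assumed to be derived from the frame-wise
+chord similarity (the exact mapping is not specified here). The score
+of the best chord path ending in chord $j$ at frame $t$ is
+\[
+  \delta_1(j) = \pi_j \, b_j(1) , \qquad
+  \delta_t(j) = b_j(t) \, \max_i \, \delta_{t-1}(i) \, a_{ij}
+  \quad (t > 1) ,
+\]
+and the transcription is recovered by backtracking from
+$\arg\max_j \delta_T(j)$ at the final frame $T$.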
+
+\section{Structural Segmentation}
+
+A beat-quantised chroma representation is used to calculate pair-wise
+similarities between beats (really: beat ``shingles'', i.e. multi-beat
+vectors). Based on this first similarity calculation, an exhaustive
+comparison of all possible segments of reasonable length in beats is
+executed, and segments are added to form segment families if they are
+sufficiently similar to another ``family member''. Once many families
+have been accumulated, they are rated, and the one with the highest
+score is used as the first segmentation group that gets
+annotated. This last step is repeated until no more families fit the
+remaining ``holes'' in the song that have not already been assigned to
+a segment.
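+
+As an illustration of the initial shingle comparison (the notation and
+the choice of similarity measure are assumptions made here, not a
+description of the implementation), let $\mathbf{c}_i$ be the chroma
+vector of beat $i$ and let a shingle of $m$ beats be the concatenation
+\[
+  \mathbf{s}_i = [\mathbf{c}_i, \mathbf{c}_{i+1}, \ldots,
+                  \mathbf{c}_{i+m-1}] .
+\]
+One possible pair-wise similarity is then the cosine similarity
+\[
+  S_{ij} = \frac{\mathbf{s}_i \cdot \mathbf{s}_j}
+                {\|\mathbf{s}_i\|_2 \, \|\mathbf{s}_j\|_2} .
+\]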
+
+This method was developed for ``classic rock'' music, and therefore
+assumes a few characteristics that are not necessarily found in other
+music: repetition of harmonic sequences in the music that coincide
+with structural segments in a song; a steady beat; segments of a
+certain length; and corresponding segments that have the same length
+in beats.
+
+
 \section{Audio Onset Detection}

 The Note Onset Detector Vamp plugin analyses a single channel of audio and estimates the onset times of notes within the music -- that is, the times at which notes and other audible events begin.