changeset 42:c4c2b7f297a4 abstract
Adding some material on Chordino
author | luisf <luis.figueira@eecs.qmul.ac.uk> |
---|---|
date | Fri, 06 Sep 2013 19:03:30 +0100 |
parents | 65dd3c90a571 |
children | 52d237639e16 |
files | vamp-plugins_abstract/qmvamp-mirex2013.tex |
diffstat | 1 files changed, 59 insertions(+), 0 deletions(-) |
--- a/vamp-plugins_abstract/qmvamp-mirex2013.tex Fri Sep 06 18:58:09 2013 +0100
+++ b/vamp-plugins_abstract/qmvamp-mirex2013.tex Fri Sep 06 19:03:30 2013 +0100
@@ -64,6 +64,65 @@
 The Key Detector Vamp plugin was written by Katy Noland and Christian Landone.
+\section{Audio Chord Extraction}
+\label{chordino}
+The Chordino plugin was developed following Mauch's 2010 work on chord
+extraction as submitted to MIREX in that
+year\cite{mauch:md1:2010}. While that submission used a C++ chroma
+implementation with a MATLAB dynamic Bayesian network as a chord
+extraction front-end\cite{matthias2010a}, Chordino is an entirely C++
+implementation that was developed specifically to be made freely
+available as an open-source plugin for general use.
+
+The method for the Chordino plugin has two parts:
+
+\subsection{NNLS Chroma}
+
+NNLS Chroma analyses a single channel of audio using frame-wise
+spectral input from the Vamp host. The spectrum is transformed to a
+log-frequency spectrum (constant-Q) with three bins per semitone. On
+this representation, two processing steps are performed: tuning, after
+which each centre bin (i.e. bin 2, 5, 8, …) corresponds to a semitone,
+even if the tuning of the piece deviates from 440 Hz standard pitch;
+and running standardisation: subtraction of the running mean, division
+by the running standard deviation. This has a spectral whitening
+effect.
+
+The processed log-frequency spectrum is then used as an input for NNLS
+approximate transcription using a dictionary of harmonic notes with
+geometrically decaying harmonics magnitudes. The output of the NNLS
+approximate transcription is semitone-spaced. To get the chroma, this
+semitone spectrum is multiplied (element-wise) with the desired
+profile (chroma or bass chroma) and then mapped to 12 bins.
+
+\subsection{Chord transcription}
+
+A fixed dictionary of chord profiles is used to calculate frame-wise
+chord similarities. A standard HMM/Viterbi approach is used to smooth
+these to provide a chord transcription.
+
+\section{Structural Segmentation}
+
+A beat-quantised chroma representation is used to calculate pair-wise
+similarities between beats (really: beat ``shingles'', i.e. multi-beat
+vectors). Based on this first similarity calculation, an exhaustive
+comparison of all possible segments of reasonable length in beats is
+executed, and segments are added to form segment families if they are
+sufficiently similar to another ``family member''. Having accumulated a
+lot of families, the families are rated, and the one with the highest
+score is used as the first segmentation group that gets
+annotated. This last step is repeated until no more families fit the
+remaining ``holes'' in the song that haven't already been assigned to a
+segment.
+
+This method was developed for ``classic rock'' music, and therefore
+assumes a few characteristics that are not necessarily found in other
+music: repetition of harmonic sequences in the music that coincide
+with structural segments in a song; a steady beat; segments of a
+certain length; corresponding segments have the same length in
+beats.
+
+
 \section{Audio Onset Detection}
 The Note Onset Detector Vamp plugin analyses a single channel of audio and estimates the onset times of notes within the music -- that is, the times at which notes and other audible events begin.
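
The chord transcription step added in this changeset (frame-wise similarity against a fixed dictionary of chord profiles, followed by HMM/Viterbi smoothing) can be illustrated with a minimal Python sketch. This is not Chordino's code: the binary major/minor triad templates, the cosine similarity measure, the uniform off-diagonal transition model, and the names `build_chord_profiles`, `viterbi_smooth`, `transcribe` and `stay_prob` are all assumptions made for illustration.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def build_chord_profiles():
    """Assumed dictionary: binary triad templates for 12 major and 12 minor chords."""
    profiles = {}
    for root in range(12):
        for suffix, intervals in (("", (0, 4, 7)), ("m", (0, 3, 7))):
            template = [0.0] * 12
            for interval in intervals:
                template[(root + interval) % 12] = 1.0
            profiles[NOTE_NAMES[root] + suffix] = template
    return profiles


def cosine_similarity(a, b):
    """Cosine similarity of two 12-bin vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def viterbi_smooth(similarities, stay_prob=0.9):
    """Return the most likely state index per frame.

    similarities: one list of non-negative per-chord scores per frame,
    treated here as unnormalised emission likelihoods (an assumption).
    """
    n_states = len(similarities[0])
    eps = 1e-9  # avoids log(0) for chords with zero similarity
    log_stay = math.log(stay_prob)
    log_move = math.log((1.0 - stay_prob) / (n_states - 1))

    score = [math.log(s + eps) for s in similarities[0]]
    backpointers = []
    for frame in similarities[1:]:
        new_score, pointers = [], []
        for j in range(n_states):
            # Best predecessor of state j under the stay/move transition model.
            best_i = max(
                range(n_states),
                key=lambda i: score[i] + (log_stay if i == j else log_move),
            )
            best = score[best_i] + (log_stay if best_i == j else log_move)
            new_score.append(best + math.log(frame[j] + eps))
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_states), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def transcribe(chroma_frames, stay_prob=0.9):
    """Map a sequence of 12-bin chroma frames to smoothed chord labels."""
    profiles = build_chord_profiles()
    names = list(profiles)
    sims = [
        # Clamp at zero so standardised (possibly negative) chroma stays valid.
        [max(cosine_similarity(frame, profiles[name]), 0.0) for name in names]
        for frame in chroma_frames
    ]
    return [names[i] for i in viterbi_smooth(sims, stay_prob)]
```

In this sketch, a single A-minor-like frame inside a run of C major frames is labelled C after smoothing, whereas a purely frame-wise best match would label it Am. How strongly the path resists chord changes depends entirely on the assumed `stay_prob`.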