view README @ 44:109d3b2c7105 matthiasm-plugin
regarding the chord estimation:
 * tweaked chord templates
 * that means that the original method also changed
author | matthiasm |
---|---|
date | Mon, 25 Oct 2010 01:58:37 +0900 |
parents | d6bb9b43ac1c |
children | 976833b7a463 |
## NNLS Chroma ##

System identifier – vamp:matthiasm:nnls_chroma
RDF URI – http://vamp-plugins.org/rdf/plugins/matthiasm#nnls_chroma (not yet available)

### General Description ###

NNLS Chroma analyses a single channel of audio using frame-wise spectral input from the Vamp host. The plugin was originally developed to extract treble and bass chromagrams for subsequent use in chord extraction methods. The spectrum is transformed to a log-frequency spectrum (constant-Q) with three bins per semitone. On this representation, two processing steps are performed:

* tuning, after which each centre bin (i.e. bin 2, 5, 8, ...) corresponds to a semitone, even if the tuning of the piece deviates from 440 Hz standard pitch.
* running standardisation: subtraction of the running mean, division by the running standard deviation. This has a spectral whitening effect.

The processed log-frequency spectrum is then used as input for NNLS approximate transcription (using a dictionary of harmonic notes with geometrically decaying harmonic magnitudes). The output of the NNLS approximate transcription is semitone-spaced. To get the chroma, this semitone spectrum is multiplied (element-wise) with the desired profile (chroma or bass chroma) and then mapped to 12 bins. The resulting chroma frames can be normalised by (dividing by) their norm (L1, L2 and maximum norm available).

### Parameters ###

The default settings (in brackets, below) are those used for Matthias Mauch's 2010 MIREX submissions.

* spectral roll on (0.00 -- 0.05; default: 0.0): consider the cumulative energy spectrum (from low to high frequencies). All bins below the first bin whose cumulative energy exceeds [spectral roll on] x [total energy] will be set to 0. A value of 0 means that no bins will be changed.
* tuning mode (global or local; default: global): local uses a local average for tuning, global uses ... exactly.
* spectral whitening (0.0 -- 1.0; default: 1.0): determines how much the log-frequency spectrum is whitened. A value of 0.0 means no whitening. For values other than 0.0 the log-freq spectral bins are divided by [standard deviation of their neighbours]^[spectral whitening], where "^" means "to the power of".
* spectral shape (0.5 -- 0.9; default: 0.7): the shape of the notes in the NNLS dictionary. Their harmonic amplitude follows a geometrically decreasing pattern, in which the i-th harmonic has an amplitude of [spectral shape]^[i-1], where "^" means "to the power of".
* chroma normalisation (none, maximum norm, L1 norm, L2 norm; default: none): determines whether or how the chromagrams are normalised. If the setting is not 'none', then each chroma frame is separately divided by the chosen vector norm. Note that normalisation implies that the joint 24-dim. "Chroma and Bass Chromagram" output will be different from the individual 12-dim. "Chromagram" and "Bass Chromagram" outputs.

### Outputs ###

### References and Credits ###
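
### Illustrative Sketch of Selected Parameters ###

The following Python sketch illustrates three of the parameter definitions given in the Parameters section: spectral roll on, spectral shape and chroma normalisation. It is not taken from the plugin source. The function names, the `n_harmonics` argument, the use of squared magnitude as "energy" and the handling of an all-zero frame are assumptions made for illustration only.

```python
import math


def apply_roll_on(spectrum, roll_on):
    """Zero all bins below the first bin whose cumulative energy exceeds
    roll_on * total energy. Energy is taken as squared magnitude here,
    which is an assumption of this sketch."""
    out = list(spectrum)
    if roll_on <= 0.0:
        return out  # a value of 0 means that no bins are changed
    energy = [x * x for x in spectrum]
    threshold = roll_on * sum(energy)
    cumulative = 0.0
    for i, e in enumerate(energy):
        cumulative += e
        if cumulative > threshold:
            break  # first bin exceeding the threshold is kept
        out[i] = 0.0
    return out


def harmonic_amplitudes(spectral_shape, n_harmonics):
    """Amplitude of the i-th harmonic is spectral_shape ** (i - 1).
    n_harmonics is a hypothetical argument; the README does not say
    how many harmonics the dictionary notes contain."""
    return [spectral_shape ** (i - 1) for i in range(1, n_harmonics + 1)]


def normalise_chroma(frame, mode="none"):
    """Divide one chroma frame by the chosen vector norm."""
    if mode == "none":
        return list(frame)
    if mode == "max":
        norm = max(abs(x) for x in frame)
    elif mode == "l1":
        norm = sum(abs(x) for x in frame)
    elif mode == "l2":
        norm = math.sqrt(sum(x * x for x in frame))
    else:
        raise ValueError("unknown normalisation mode: %r" % (mode,))
    if norm == 0.0:
        return list(frame)  # assumption: leave silent frames untouched
    return [x / norm for x in frame]


if __name__ == "__main__":
    print(apply_roll_on([1.0, 1.0, 2.0, 4.0], 0.05))
    print(harmonic_amplitudes(0.7, 5))
    # Toy 2-bin "chroma" and "bass chroma" frames instead of 12-bin ones.
    treble, bass = [3.0, 0.0], [0.0, 4.0]
    print(normalise_chroma(treble, "l2"), normalise_chroma(bass, "l2"))
    print(normalise_chroma(treble + bass, "l2"))
```

The last two lines of the usage example show why a normalised joint output differs from the individually normalised outputs: assuming the joint frame is normalised as a whole, its norm is computed over both parts together, so each part is scaled by a different factor than when it is normalised alone.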