annotate examples/vamp-example-plugins.txt @ 257:f80e34e36a79

...
author cannam
date Fri, 14 Nov 2008 12:22:41 +0000
parents 88ef5ffdbe8d
children
rev   line source
cannam@255 1
cannam@255 2 Vamp Example Plugins
cannam@255 3 ====================
cannam@255 4
cannam@257 5 The [vamp-example-plugins] library contains a number of Vamp audio
cannam@255 6 analysis plugins provided as part of the Vamp plugin SDK.
cannam@255 7
cannam@255 8 These are simple, but sometimes useful, plugins whose source code you
cannam@255 9 are free to study and reuse in any proprietary or non-proprietary
cannam@255 10 plugins of your own without any licensing obligation.
cannam@255 11
cannam@255 12 User documentation for the individual plugins in this library follows.
cannam@255 13
cannam@255 14 Amplitude Follower
cannam@255 15 ==================
cannam@255 16
cannam@257 17 *System identifier* -- vamp-example-plugins:amplitudefollower //
cannam@257 18 *RDF URI* -- http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#amplitudefollower
cannam@255 19
cannam@255 20 Amplitude Follower tracks and returns the amplitude of the audio
cannam@257 21 signal sample by sample, returning peak values block by block.
cannam@255 22
cannam@255 23 Parameters
cannam@255 24 ----------
cannam@255 25
cannam@257 26 *Attack time* (seconds) -- The 60dB convergence time for an increase in amplitude. //
cannam@257 27 *Release time* (seconds) -- The 60dB convergence time for a decrease in amplitude.
cannam@257 28
cannam@257 29 For example, if you feed the plugin with a simple step function that
cannam@257 30 jumps from level A to level B, then the output will start off as A,
cannam@257 31 then at the moment of stepping it will start to converge exponentially
cannam@257 32 to B, reaching within 60dB of the actual value within the time specified
cannam@257 33 by the Attack time parameter.
cannam@257 34
cannam@257 35 Similarly, if the plugin's input then steps down from B to A, the
cannam@257 36 output will start converging at the moment of stepping, reaching
cannam@257 37 within 60dB of the new value within the time specified by the Release
cannam@257 38 time parameter.
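
The convergence behaviour described above can be sketched as a one-pole
smoother with separate attack and release coefficients. This is an
illustrative sketch only, not the plugin's source: the function names
and the way the coefficient is derived from the 60dB convergence time
are assumptions made for the example.

```python
def follow_amplitude(samples, sample_rate, attack_time, release_time):
    """Track |sample| with separate attack and release smoothing."""

    def coefficient(seconds):
        # Assumed derivation: a per-sample decay factor chosen so that the
        # remaining distance to the target shrinks by 60 dB (a factor of
        # 0.001 in amplitude) after `seconds`.
        steps = seconds * sample_rate
        return 0.0 if steps <= 0 else 0.001 ** (1.0 / steps)

    attack = coefficient(attack_time)
    release = coefficient(release_time)
    level = 0.0
    tracked = []
    for sample in samples:
        target = abs(sample)
        # Rising input uses the attack coefficient, falling input the release.
        coef = attack if target > level else release
        level = target + coef * (level - target)
        tracked.append(level)
    return tracked


def block_peaks(tracked, block_size):
    """Reduce the per-sample envelope to one peak value per block."""
    return [max(tracked[i:i + block_size])
            for i in range(0, len(tracked), block_size)]
```

With a step from 0 to 1 at a sample rate of 1000 and an attack time of
0.1 seconds, the tracked level is within 0.001 of the target (60dB
below the step size) after 100 samples.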
cannam@255 39
cannam@255 40 Outputs
cannam@255 41 -------
cannam@255 42 Amplitude
cannam@255 43 ~~~~~~~~~
cannam@257 44 The peak tracked amplitude for the current processing block.
cannam@255 45
cannam@257 46 References and Credits
cannam@257 47 ----------------------
cannam@257 48 Amplitude Follower uses a method from the SuperCollider audio
cannam@257 49 processing language. It was implemented as a Vamp plugin by Dan
cannam@257 50 Stowell.
cannam@255 51
cannam@255 52
cannam@255 53 Simple Fixed Tempo Estimator
cannam@255 54 ============================
cannam@255 55
cannam@257 56 *System identifier* -- vamp-example-plugins:fixedtempo //
cannam@257 57 *RDF URI* -- http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#fixedtempo
cannam@255 58
cannam@255 59 Simple Fixed Tempo Estimator analyses a fragment of audio and
cannam@255 60 estimates its tempo. It assumes that its input is of fixed tempo, and
cannam@255 61 it analyses only the first few seconds (a small, configurable duration)
cannam@255 62 before returning a result, discarding all subsequent input.
cannam@255 63
cannam@255 64 The plugin calculates an overall energy rise function across a series
cannam@255 65 of short frequency-domain input frames, takes the autocorrelation of
cannam@255 66 this function, filters it to stress possible metrical patterns,
cannam@255 67 locates peaks, and converts from autocorrelation lag to the
cannam@255 68 corresponding tempo.
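
The last two steps, peak location and lag-to-tempo conversion, can be
illustrated with a minimal sketch. It assumes a detection function
already sampled at a known rate (df_rate, in values per second); the
function names and the default tempo range are hypothetical, and the
filtering step is omitted entirely.

```python
import math


def autocorrelate(df):
    """Unnormalised autocorrelation for lags 0 .. len(df)//2 - 1."""
    n = len(df)
    return [sum(df[i] * df[i + lag] for i in range(n - lag))
            for lag in range(n // 2)]


def lag_to_tempo(lag, df_rate):
    """Convert a lag in detection-function frames to beats per minute."""
    return 60.0 * df_rate / lag


def estimate_tempo(df, df_rate, min_bpm=50.0, max_bpm=190.0):
    """Pick the strongest autocorrelation lag inside the tempo range."""
    acf = autocorrelate(df)
    # A fast tempo means a short lag, so max_bpm gives the smallest lag.
    min_lag = max(1, math.ceil(60.0 * df_rate / max_bpm))
    max_lag = min(len(acf) - 1, math.floor(60.0 * df_rate / min_bpm))
    best = max(range(min_lag, max_lag + 1), key=lambda lag: acf[lag])
    return lag_to_tempo(best, df_rate)
```

For example, a detection function sampled 100 times per second with an
impulse every 50 frames has its strongest in-range autocorrelation peak
at lag 50, which corresponds to a beat period of 0.5 seconds, or 120
bpm.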
cannam@255 69
cannam@255 70 The filtering process involves searching for peaks at simple
cannam@255 71 metrically related intervals (at a given autocorrelation lag as well
cannam@255 72 as at 0.5, 2, and 4 times that lag), boosting each peak that shows
cannam@255 73 strong related peaks. A simplistic perceptual curve is also applied
cannam@255 74 in order to increase the probability of detecting a "likely" tempo.
cannam@255 75 For improved tempo precision, each tempo with strong related peaks is
cannam@255 76 averaged with the tempi calculated from those peaks.
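
A much-simplified sketch of the boosting idea follows. It is not the
plugin's filter: it adds to every lag (not only to peaks) the
autocorrelation values found at the related lags, and it omits both
the perceptual curve and the tempo averaging. The function name and
the ratios argument are assumptions for illustration.

```python
def boost_related(acf, ratios=(0.5, 2, 4)):
    """Add to each lag the autocorrelation found at metrically related lags."""
    filtered = []
    for lag, value in enumerate(acf):
        total = value
        for ratio in ratios:
            related = round(lag * ratio)
            # Ignore lag 0 and anything beyond the end of the function.
            if 0 < related < len(acf):
                total += acf[related]
        filtered.append(total)
    return filtered
```

Given equally strong raw peaks at lags 50, 70, 100 and 200, the peak at
lag 50 is reinforced by those at 100 and 200, while the isolated peak
at lag 70 is left unchanged.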
cannam@255 77
cannam@257 78 The method is best suited for 4/4 pop and dance rhythms.
cannam@255 79
cannam@255 80 This plugin returns many of its intermediate calculations as
cannam@255 81 additional outputs, as well as the most favoured tempo. Although as a
cannam@255 82 tempo estimator it's still fairly primitive, it is intended to provide
cannam@255 83 a useful example of a slightly more complex feature extraction plugin
cannam@255 84 than the other examples, as well as one that returns several different
cannam@255 85 types of output at a time.
cannam@255 86
cannam@255 87 Parameters
cannam@255 88 ----------
cannam@255 89
cannam@257 90 *Minimum estimated tempo*, *Maximum estimated tempo* (bpm) -- These
cannam@255 91 parameters control the range of values within which the tempo
cannam@255 92 estimator will return its estimate.
cannam@255 93
cannam@257 94 *Input duration to study* (seconds) -- The tempo estimator uses only the
cannam@255 95 first part of its input, discarding any that follows. This parameter
cannam@255 96 controls how much input it will use. There is no value in increasing
cannam@255 97 this beyond 8x the duration of the slowest returned beat. The default
cannam@255 98 of 10 seconds is likely to be appropriate for most purposes.
cannam@255 99
cannam@255 100 Outputs
cannam@255 101 -------
cannam@255 102
cannam@255 103 Tempo
cannam@255 104 ~~~~~
cannam@255 105
cannam@255 106 The tempo estimator's best guess at the tempo of its input, in beats
cannam@255 107 per minute.
cannam@255 108
cannam@255 109 This is returned as a feature whose timestamp and duration cover the
cannam@255 110 range of the input which was used in estimating the tempo, with a
cannam@255 111 single value containing the tempo.
cannam@255 112
cannam@255 113 Tempo candidates
cannam@255 114 ~~~~~~~~~~~~~~~~
cannam@255 115
cannam@255 116 Several guesses at the possible tempo. This output is returned as a
cannam@255 117 single feature whose timestamp and duration cover the range of the
cannam@255 118 input which was used in estimating the tempo, with up to 10 bins
cannam@255 119 containing one tempo value in each bin, with the "best guess" tempo in
cannam@255 120 bin 0.
cannam@255 121
cannam@255 122 Detection function
cannam@255 123 ~~~~~~~~~~~~~~~~~~
cannam@255 124
cannam@255 125 The basic onset detection function used in tempo estimation.
cannam@255 126
cannam@255 127 Autocorrelation function
cannam@255 128 ~~~~~~~~~~~~~~~~~~~~~~~~
cannam@255 129
cannam@255 130 The autocorrelation of the onset detection function.
cannam@255 131
cannam@255 132 Filtered Autocorrelation
cannam@255 133 ~~~~~~~~~~~~~~~~~~~~~~~~
cannam@255 134
cannam@255 135 The autocorrelation after filtering to boost values with possible
cannam@255 136 metrically related peaks and to apply perceptual weighting. The peak
cannam@255 137 value of this function is the one that will be used as the "best
cannam@255 138 guess".
cannam@255 139
cannam@257 140 References and Credits
cannam@257 141 ----------------------
cannam@257 142 Simple Fixed Tempo Estimator uses a method derived from work by
cannam@257 143 Matthew Davies: see for example M. E. P. Davies and M. D. Plumbley,
cannam@257 144 _Beat Tracking With A Two State Model_, in Proceedings of the IEEE
cannam@257 145 International Conference on Acoustics, Speech and Signal Processing
cannam@257 146 2005. This plugin, made by Chris Cannam, is only an unsubtle
cannam@257 147 simplification of a very small part of the published method.
cannam@257 148
cannam@257 149 The Queen Mary plugin set
cannam@257 150 (http://www.elec.qmul.ac.uk/digitalmusic/downloads/index.html#qm-vamp-plugins)
cannam@257 151 contains a Tempo and Beat Tracker plugin by Matthew Davies providing a
cannam@257 152 more realistic implementation.
cannam@257 153
cannam@255 154
cannam@255 155 Simple Percussion Onset Detector
cannam@255 156 ================================
cannam@255 157
cannam@257 158 *System identifier* -- vamp-example-plugins:percussiononsets //
cannam@257 159 *RDF URI* -- http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#percussiononsets
cannam@255 160
cannam@255 161 Simple Percussion Onset Detector estimates the locations of percussive
cannam@257 162 onsets in the audio signal.
cannam@255 163
cannam@255 164 The principle is to exploit the broadband nature of noisy percussive
cannam@255 165 onsets by identifying only those frames in which the energy rise shows
cannam@255 166 a broadband profile.
cannam@255 167
cannam@255 168 The plugin takes a series of frequency domain frames, and examines
cannam@255 169 each frame to count the number of bins whose energy content has
cannam@255 170 increased by more than a certain threshold since the prior frame.
cannam@255 171 Frames in which this number is at a peak relative to prior and
cannam@255 172 following frames and also exceeds another threshold value are
cannam@255 173 classified as percussive onsets.
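
The counting and peak-picking steps might be sketched as below. This is
an illustration under stated assumptions rather than the plugin's code:
each frame is taken to be a list of per-bin energy (squared magnitude)
values, the dB threshold is applied as an energy ratio, and the
function names are hypothetical.

```python
def rise_counts(frames, threshold_db):
    """Count, per frame, the bins whose energy rose by more than threshold_db."""
    ratio = 10.0 ** (threshold_db / 10.0)  # dB expressed as an energy ratio
    counts = [0]  # the first frame has no prior frame to compare against
    for prior, current in zip(frames, frames[1:]):
        counts.append(sum(1 for p, c in zip(prior, current) if c > p * ratio))
    return counts


def find_onsets(frames, threshold_db, sensitivity_percent):
    """Return indices of frames whose rise count is a large enough local peak."""
    counts = rise_counts(frames, threshold_db)
    bins = len(frames[0]) if frames else 0
    minimum = bins * sensitivity_percent / 100.0
    onsets = []
    for i in range(1, len(counts) - 1):
        is_peak = counts[i] > counts[i - 1] and counts[i] >= counts[i + 1]
        if is_peak and counts[i] > minimum:
            onsets.append(i)
    return onsets
```

In this sketch a frame in which three of four bins jump by 10dB is
reported as an onset when the required proportion is 50%, but not when
it is 80%.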
cannam@255 174
cannam@255 175 Parameters
cannam@255 176 ----------
cannam@255 177
cannam@257 178 *Energy rise threshold* (dB) -- The rise in energy within a bin from one
cannam@255 179 frame to the next that is required for a bin to be counted toward the
cannam@255 180 detection function's bin count. This roughly corresponds to how
cannam@255 181 "loud" a percussive sound must be in order to be detected.
cannam@255 182
cannam@257 183 *Sensitivity* (%) -- The proportion of bins that must exceed the energy
cannam@255 184 rise threshold in order for an onset to be detected (at frames in
cannam@255 185 which the detection function peaks). This roughly corresponds to how
cannam@255 186 "noisy" a percussive sound must be in order to be detected.
cannam@255 187
cannam@255 188 Outputs
cannam@255 189 -------
cannam@255 190
cannam@255 191 Onsets
cannam@255 192 ~~~~~~
cannam@255 193
cannam@255 194 The estimated onset locations.
cannam@255 195
cannam@255 196 Detection Function
cannam@255 197 ~~~~~~~~~~~~~~~~~~
cannam@255 198
cannam@255 199 The energy rise detection function whose peaks were used to estimate
cannam@255 200 onset locations.
cannam@255 201
cannam@257 202 References and Credits
cannam@257 203 ----------------------
cannam@257 204 The method used in Simple Percussion Onset Detector was described in
cannam@257 205 "Drum Source Separation using Percussive Feature Detection and
cannam@257 206 Spectral Modulation" by Dan Barry, Derry Fitzgerald, Eugene Coyle and
cannam@257 207 Bob Lawlor, ISSC 2005. The plugin was made by Chris Cannam.
cannam@257 208
cannam@255 209
cannam@255 210 Simple Power Spectrum
cannam@255 211 =====================
cannam@255 212
cannam@257 213 *System identifier* -- vamp-example-plugins:powerspectrum //
cannam@257 214 *RDF URI* -- http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#powerspectrum
cannam@255 215
cannam@255 216 Simple Power Spectrum returns a power spectrum calculated from
cannam@255 217 windowed short-time Fourier transforms of the input audio. (The power
cannam@255 218 spectrum for a frame consists of a sequence of the squares of the
cannam@255 219 magnitudes of the complex values for each frequency bin in the result
cannam@255 220 of the Fourier transform.)
cannam@255 221
cannam@255 222 This very simple plugin is an illustration of the fact that if a
cannam@255 223 plugin requests frequency-domain input, its input will already be in
cannam@255 224 the form needed for a spectrum such as this. The plugin has no work
cannam@255 225 left to do except to calculate the squared magnitude from the
cannam@255 226 cartesian complex representation.
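
The remaining calculation is small enough to show in full. The sketch
below assumes that the frequency-domain frame arrives as a flat
sequence of interleaved real and imaginary components, one pair per
bin; that layout and the function name are assumptions of the example.

```python
def power_spectrum(interleaved):
    """Squared magnitude of each (real, imaginary) pair in a flat sequence."""
    reals = interleaved[0::2]
    imags = interleaved[1::2]
    return [re * re + im * im for re, im in zip(reals, imags)]
```

A block size of 4 gives 4/2+1 = 3 bins, so a frame of six floats such
as 3, 4, 0, 1, 2, 0 yields the power values 25, 1 and 4.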
cannam@255 227
cannam@255 228 This plugin also illustrates how to return "grid-type" visualisation
cannam@255 229 data from a Vamp plugin.
cannam@255 230
cannam@255 231 Parameters
cannam@255 232 ----------
cannam@255 233
cannam@255 234 None.
cannam@255 235
cannam@255 236 Outputs
cannam@255 237 -------
cannam@255 238
cannam@255 239 Power Spectrum
cannam@255 240 ~~~~~~~~~~~~~~
cannam@255 241
cannam@255 242 The power spectrum calculated from the input frame. This output
cannam@255 243 returns a single feature per processing block, containing
cannam@255 244 blocksize/2+1 power values corresponding to the FFT bins from DC to
cannam@255 245 Nyquist inclusive. The DC bin is always returned.
cannam@255 246
cannam@255 247
cannam@255 248 Spectral Centroid
cannam@255 249 =================
cannam@255 250
cannam@257 251 *System identifier* -- vamp-example-plugins:spectralcentroid //
cannam@257 252 *RDF URI* -- http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#spectralcentroid
cannam@255 253
cannam@255 254 Spectral Centroid calculates the "centre of gravity" of the frequency
cannam@255 255 spectrum for each input frame.
cannam@255 256
cannam@255 257 Parameters
cannam@255 258 ----------
cannam@255 259
cannam@255 260 None.
cannam@255 261
cannam@255 262 Outputs
cannam@255 263 -------
cannam@255 264
cannam@255 265 Log Frequency Centroid
cannam@255 266 ~~~~~~~~~~~~~~~~~~~~~~
cannam@255 267
cannam@255 268 The centroid of the log-weighted frequency spectrum. That is, the sum
cannam@255 269 across Fourier transform output bins of the logarithm of the bin
cannam@255 270 frequency multiplied by the bin magnitude, divided by the sum of the
cannam@255 271 bin magnitudes, and the inverse logarithm taken so as to give the
cannam@255 272 result as a frequency in Hz.
cannam@255 273
cannam@255 274 Linear Frequency Centroid
cannam@255 275 ~~~~~~~~~~~~~~~~~~~~~~~~~
cannam@255 276
cannam@255 277 The centroid of the linear-weighted frequency spectrum. That is, the
cannam@255 278 sum across Fourier transform output bins of the bin frequency
cannam@255 279 multiplied by the bin magnitude, divided by the sum of the bin
cannam@255 280 magnitudes. The result is a frequency in Hz.
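
Both centroids can be written directly from these definitions. In the
sketch below the function name is hypothetical, bin i is assumed to lie
at i * sample_rate / block_size Hz, and the DC bin is skipped because
the logarithm of 0 Hz is undefined (an assumption of this example, not
a statement about the plugin).

```python
import math


def spectral_centroids(magnitudes, sample_rate, block_size):
    """Return (log centroid, linear centroid) in Hz, or None for silence."""
    weighted_linear = 0.0
    weighted_log = 0.0
    total = 0.0
    for i in range(1, len(magnitudes)):  # skip DC (bin 0)
        frequency = i * sample_rate / block_size
        weighted_linear += frequency * magnitudes[i]
        weighted_log += math.log10(frequency) * magnitudes[i]
        total += magnitudes[i]
    if total == 0.0:
        return None  # no energy above DC, so no centroid is defined
    return 10.0 ** (weighted_log / total), weighted_linear / total
```

With equal magnitudes at 1000 Hz and 4000 Hz, the linear centroid is
the arithmetic mean (2500 Hz) and the log centroid is the geometric
mean (2000 Hz).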
cannam@255 281
cannam@255 282
cannam@255 283 Zero Crossings
cannam@255 284 ==============
cannam@255 285
cannam@257 286 *System identifier* -- vamp-example-plugins:zerocrossing //
cannam@257 287 *RDF URI* -- http://vamp-plugins.org/rdf/plugins/vamp-example-plugins#zerocrossing
cannam@255 288
cannam@255 289 Zero Crossings calculates the positions and density of "zero-crossing"
cannam@255 290 points in an audio waveform. For the purposes of this plugin, that
cannam@255 291 means those positions at which the sampled value switches from
cannam@255 292 zero-or-less to greater-than-zero, or vice versa.
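
That definition translates into a comparison of each sample's sign
state with that of the sample before it. The following is a minimal
sketch for a single buffer, with a hypothetical function name; a
streaming version would also need to carry the state of the last
sample from one block to the next.

```python
def zero_crossings(samples):
    """Indices at which the signal switches between <= 0 and > 0."""
    positions = []
    for i in range(1, len(samples)):
        was_positive = samples[i - 1] > 0.0
        is_positive = samples[i] > 0.0
        if was_positive != is_positive:
            positions.append(i)
    return positions
```

The length of the returned list gives the count for the buffer, and
the indices give the crossing locations. Note that a sample of exactly
zero counts as "zero-or-less", so the sequence 0, 1, 2, -1, 0, 0.5 has
crossings at indices 1, 3 and 5.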
cannam@255 293
cannam@255 294 Parameters
cannam@255 295 ----------
cannam@255 296
cannam@255 297 None.
cannam@255 298
cannam@255 299 Outputs
cannam@255 300 -------
cannam@255 301
cannam@255 302 Zero Crossing Counts
cannam@255 303 ~~~~~~~~~~~~~~~~~~~~
cannam@255 304
cannam@255 305 The number of zero-crossing points found in the current block of
cannam@255 306 samples, as a single-valued feature returned per processing block.
cannam@255 307
cannam@255 308 Zero Crossings
cannam@255 309 ~~~~~~~~~~~~~~
cannam@255 310
cannam@255 311 The locations of zero-crossing points, returning one feature
cannam@255 312 timestamped to the zero-crossing location, without values, for each
cannam@255 313 crossing point.
cannam@255 314